Introduction

Dataset

We ran simulations with a NetLogo model (see GitHub mathjoss/bayes-in-network), using different combinations of parameters (structure of the network, percentage of biased people, shape of the bias…) and measuring multiple variables related to the language value of the population.

The results are stored in three files:

| Input file | Format | DV | IV | Number of replications | Maximum number of ticks |
|---|---|---|---|---|---|
| example_time.csv | CSV file | language value (all agents, biased agents, unbiased agents) | Scale-free, 500 agents, SAM, 10% biased agents, initial language | 100 | 1000 |
| analysis.csv | CSV file | language value (all agents, biased agents, unbiased agents); stabilisation time (all agents, biased agents, unbiased agents); communities (mean and std of the language value + number of agents in each community) | set of combination 1 (see Set of combination 1 - analysis.csv) | 100 | 1000 |
| extra_analysis.csv | CSV file | language value (all agents) | set of combination 2 (see Set of combination 2 - extra_analysis.csv) | 50 | 500 |

Framework - methods

Our framework is implemented in NetLogo 6.1.1 (https://ccl.northwestern.edu/netlogo/). The experiments were run on an Intel Core i7-8700 system with 32 GB RAM under Ubuntu 18.04, and the results were analysed using R 3.6.3/RStudio 1.4 on machines running Ubuntu 18.04 and macOS 10.15 (Catalina). The full source code and results are available on GitHub (mathjoss/bayes-in-network).

Our simulation framework is based on previously published models (Dediu, 2008, 2009) and has three main components: the language, the agents, and the communicative network. The language is modelled here as being composed of one (or more) binary features that are obligatorily expressed in each individual utterance produced or perceived by the agents. We may think of these abstract features as representing, for instance, the use of the alveolar trill /r/ (value 1) or of a different r-like sound (value 0), the use of pitch to make a linguistic distinction (1) or not (0), having a subject-verb word order (1) or a verb-subject order (0), making a gender distinction (1) or not (0), using center embedding (1) or not (0), or any number of such alternatives. Thus, if we take the /r/ interpretation, a set of utterances {1,1,1} might be produced by an agent that can trill without issues, a {0,0,0} by one that cannot, and {1,0,1} by an agent that either does not make the distinction or whose ability to trill is affected by other factors (e.g., socio-linguistic or co-articulatory). Each agent embodies three components: language acquisition, the internal representation of language, and the production of utterances. The first concerns the way observed data (in the form of “heard” utterances) affect (or not) the internal representation of language that the agent has. The second is the manner in which the agent maintains the information about language. And the third, the way the agent uses its internal representation of the language to produce actual utterances.

We opted here for a Bayesian model of language evolution as introduced by Griffiths & Kalish (2007), and widely used in recent studies of language evolution and change (e.g., Dediu, 2008, 2009; Kirby, Dowman, & Griffiths, 2007, among others). To do so, we used agent-based modelling in NetLogo, where we created societies of agents sharing connections with each other. The NetLogo program is available on GitHub (mathjoss/bayes-in-network) and contains many functionalities not used in this analysis. To understand how to use our NetLogo code and parameters, please refer to Appendix: Netlogo guide.

As a general approach, the Bayesian model proposes that there is a universe of possible languages (discrete or continuous), \(h \in U,\) and that an agent maintains at all times a probability distribution over all these possible languages. Initially, before seeing any linguistic data, the agent has a prior distribution over these possible languages, \(p(h)\), and, as new data (in the form of observed utterances), \(d = \{u_{1}, u_{2}, … u_{n}\}\), come in, this probability is updated following Bayes’ rule, resulting in the posterior distribution: \[p(h|d) = \frac{p(d|h) \cdot p(h)}{p(d)}\] that reflects the new representation that the agent has of the probability of each possible language \(h \in U\) after having heard the utterances composing the data \(d\). In this, \(p(d|h)\) is the likelihood that the observed data \(d\) was generated by language \(h\), and \(p(d)\) is a normalisation factor ensuring that \(p(h|d)\) is a probability bounded by 0.0 and 1.0.

In this paper, we model a single binary feature and consequently the utterances, \(u\), collapse to a single bit of information, “0” or “1”. The observed data, \(d\), become binary strings, and one of the simplest models of language is that of throwing a (potentially unfair) coin that returns, with probability \(h \in [0,1]\), a “1” (otherwise, with probability \(1-h\), a “0”). Thus, the universe of our languages, \(h\), is the real number interval \(U = [0,1] \subset {\rm I\!R}\), and the likelihood of observing an utterance \(u \in \{ 0, 1 \}\) is given by the Bernoulli distribution with parameter \(h\); for a set of utterances \(d = \{u_{1}, u_{2}, … u_{n}\}\), the likelihood is given by the binomial distribution with parameters \(k = |\{u_{i}=1\}_{i=1..n}|\) (the number of utterances “1”), \(n\) (the total number of utterances), and \(h: p(d|h) = Binomial(k,n,h) = \frac{n!}{k!(n-k)!}h^{k}(1-h)^{n-k}\), where \(x! = 1 \cdot 2 \cdot ... \cdot (x-1) \cdot x\); thus, we can reduce the set of utterances forming the data \(d\), without any loss of information, to the number of “1” utterances (\(k\)) and the total number of utterances (\(n\)). In Bayesian inference we sometimes use the conjugate prior of a given likelihood, in this case, the Beta distribution defined by two shape parameters, \(\alpha\) and \(\beta\), with probability density \(f(x,\alpha,\beta) = \frac{1}{B(\alpha,\beta)}x^{\alpha-1}(1-x)^{\beta-1}\), where \(B(\alpha,\beta)\) normalizes the density between 0.0 and 1.0. 
With these, the prior distribution of language \(h\) is \(f(h,\alpha_{0},\beta_{0})\), with parameters \(\alpha_{0}\) and \(\beta_{0}\) defining the shape of this distribution (see below), and the posterior distribution, updated after seeing the data \(d=(k,n)\), is \(p(h|d) = f(h,\alpha_{1},\beta_{1})\), where \(\alpha_{1} = \alpha_{0} + k\) and \(\beta_{1} = \beta_{0} + (n-k)\); thus, the posterior distribution is also Beta, with the shape parameter \(\alpha\) “keeping track” of the “1” utterances and \(\beta\) of the “0” utterances, and the Bayesian updating is reduced to simple (and very fast) arithmetic operations. When it comes to utterance production, a SAM agent chooses a value \(h \in [0,1]\) from the \(B(\alpha_{1},\beta_{1})\) distribution (i.e., proportional to \(f(h,\alpha_{1},\beta_{1})\)), while a MAP agent picks the mode of the distribution, \(h_{M} = \frac{\alpha_{1}-1}{\alpha_{1}+\beta_{1}-2}\); afterward, the agent uses this number between 0.0 and 1.0 as the parameter of a Bernoulli distribution (a coin throw) to produce a single “0” or “1” value with this probability – this value is then the utterance that the agent produces.
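The update and production rules above can be sketched in a few lines (the actual model is implemented in NetLogo; this Python sketch, with our own function names, is only illustrative):

```python
import random

def update(alpha, beta, utterances):
    """Conjugate Bayesian update: alpha tracks the "1" utterances heard,
    beta tracks the "0" utterances."""
    k = sum(utterances)              # number of "1" utterances
    n = len(utterances)              # total number of utterances
    return alpha + k, beta + (n - k)

def produce(alpha, beta, strategy="SAM", rng=random):
    """Pick h from the posterior (SAM samples it, MAP takes the mode), then
    throw a Bernoulli coin with parameter h to emit one utterance."""
    if strategy == "SAM":
        h = rng.betavariate(alpha, beta)
    else:                            # MAP: mode of Beta(alpha, beta), for alpha, beta > 1
        h = (alpha - 1) / (alpha + beta - 2)
    return 1 if rng.random() < h else 0

# An unbiased agent (mu0 = 0.5, lambda0 = 0.9, i.e. alpha = beta = 2.2) hears
# four "1" utterances; the posterior Beta(6.2, 2.2) has mode 5.2/6.4 = 0.8125
alpha, beta = update(2.2, 2.2, [1, 1, 1, 1])
```

Note how a few observed “1”s already pull the unbiased agent’s preferred value well above 0.5.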

This choice (Bernoulli/Beta) does not necessarily reflect how data are used by real humans in learning a language, but it has several major advantages, most notably its simplicity, transparency, and computational efficiency, making it possible to run very large simulations on a consumer-grade computer in reasonable time (Dediu, 2009). Probably the most relevant here is the fact that the bias can be modeled only through the shape parameters of the prior Beta distribution, \(\alpha_{0}\) and \(\beta_{0}\), as the likelihood function is fixed to the Binomial, and utterance production offers only a limited choice between SAM and MAP. However, the Beta distribution is notoriously flexible, and can represent anything from (almost) flat (or uninformative) distributions to extremely peaked or “U”-shaped ones. Moreover, for unimodal cases, we can model not only the location of the peak (i.e., the “preferred” value), but also the spread of this peak (i.e., how “strong” this preference is; operationally, how much data is needed to change the preferred value); we actually describe the Beta distribution using these alternative parameters, the mode \(\mu\) (describing the “preferred” location) and the “spread” \(\lambda\), which are linked to the shape parameters \(\alpha\) and \(\beta\) (see Box 1). Thus, arguably, the Beta distribution is flexible enough to model relatively well an intuitive view of what such a bias might look like – not just a preferred value but also a strength of this preference. See Figure 1 for an example of how different prior distributions are updated upon seeing some data.

You can see more details about this process in Strength and location of the bias.

Independent variables

In our Netlogo model, we used the following variables:

| Parameter | Variable name | Dependencies | Comments |
|---|---|---|---|
| Network size | size_net | none | The number of agents in the network; it is fixed for a given run |
| Frequency of biased agents | prop_biased | none | The proportion of agents in the network that are biased; please note that here we consider networks containing a single type of biased agents |
| Bias location and strength | bias_strength | none | Only the strength value of biased agents varies; the value for unbiased agents is fixed and set to \(\mu_{0} = 0.5\), \(\lambda_{0} = 0.9\). See more information in Strength and location of the bias |
| Proportion of highest centrality agents that are biased | influencers_biased | depends on prop_biased | More information in Influencers biased |
| Utterance production mechanism | learners | none | More information in Learners |
| Network type | network | none | More information in Network type |
| Initial language | init_lang | none | The total number of utterances (n0) and the number of utterances “1” (k0) presented to all the agents in the network in the initial iteration i = 0. More information in Initial language |

Set of combination 1 - analysis.csv

The parameters we used in the set of combination 1 are the following:

  • size_net: 10 (“tiny”), 50 (“small”), 150 (“medium”), 500 (“large”), 1000 (“very large”)
  • prop_biased: 0% (“fully unbiased”), 10%, 30%, 50%, 100% (“fully biased”)
  • bias_strength: \(\mu_{0} = 0.1\), \(\lambda_{0} = 0.6\) (“biased flexible”); \(\mu_{0} = 0.1\), \(\lambda_{0} = 0.1\) (“biased rigid”)
  • influencers_biased: 0% (“Random”), 10% (“biased influences”)
  • learners: SAM (“sampler”), MAP (“a posteriori maximizer”)
  • network: Random, Scale-free, Small-world
  • init_lang: k0 = 0, n0 = 0 (“no initial language”); k0 = 4, n0 = 4 (“initial language”)

Set of combination 2 - extra_analysis.csv

The parameters we used in the set of combination 2 are the following:

  • size_net: 150 (“medium”)
  • prop_biased: 0 to 100%, in steps of 1%
  • bias_strength: \(\mu_{0}\) = 0.1 (biased), \(\lambda_{0}\) = 0.01 to 0.99, in steps of 0.01
  • influencers_biased: 0% (“Random”), 50% (“biased influences”), 100% (“biased extremely influent”)
  • learners: SAM (“sampler”)
  • network: Random, Scale-free, Small-world
  • init_lang: k0 = 4, n0 = 4 (“initial language”)

Strength and location of the bias

The initial internal representation of the language (at \(t = 0\)) is a Beta distribution (alpha, beta). The Beta distribution is notoriously flexible, and can represent anything from (almost) flat (or uninformative) distributions to extremely peaked or “U”-shaped ones. For unimodal cases, we can model:

  • \(\mu_{0}\): location of the bias (or mode), i.e., the “preferred” value. The higher \(\mu_{0}\), the more likely the individual is to produce utterances “1”.
  • \(\lambda_{0}\): the spread of the peak (or strength of the bias), i.e., how “strong” this preference is; operationally, how much data is needed to change the preferred value. The higher \(\lambda_{0}\), the less strongly biased the agent.

We actually describe the Beta distribution using these alternative parameters, the mode \(\mu_{0}\) and the “spread” \(\lambda_{0}\), which are linked to the shape parameters (alpha, beta) using a small algorithm:

  • the user selects the mode of the Beta distribution (= the location of the bias) and the learning acceptance (= how strong the bias is);
  • the program computes the lower and upper uncertainty limits from the given mode and learning acceptance, such that these limits are within the interval [0, 1];
  • the values for the mode and the upper and lower limits are passed to the betaExpert function from the prevalence package, which computes the unique values of alpha and beta; for optimization and future-proofing reasons, we precomputed and hard-coded the alpha and beta values used in this paper within our NetLogo script (available in the GitHub repository mathjoss/bayes-in-network).

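The link between the mode and the shape parameters is easy to sketch once a concentration \(\kappa = \alpha + \beta\) is fixed. The Python sketch below shows only the mode-parametrization identity, not the betaExpert computation (which derives the concentration from \(\lambda_{0}\) via the uncertainty limits); `concentration` is our illustrative parameter:

```python
def beta_from_mode(mode, concentration):
    """Shape parameters of a unimodal Beta with the given mode, for a fixed
    concentration kappa = alpha + beta (requires kappa > 2).
    Uses the identity mode = (alpha - 1) / (alpha + beta - 2)."""
    alpha = mode * (concentration - 2) + 1
    beta = (1 - mode) * (concentration - 2) + 1
    return alpha, beta

# With kappa ≈ 101.58 this reproduces the "biased rigid" shapes listed in
# this section: (alpha, beta) ≈ (10.96, 90.62)
alpha, beta = beta_from_mode(0.1, 101.58)
```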

To speed up the simulations, and to prevent compatibility problems in NetLogo, we saved these alpha and beta values directly inside the NetLogo code. For example, this algorithm creates the following alpha and beta values according to \(\mu_{0}\) and \(\lambda_{0}\):

| (\(\mu_{0}\), \(\lambda_{0}\)) | (alpha, beta) | Label |
|---|---|---|
| (0.1, 0.1) | (10.96, 90.62) | “biased rigid” |
| (0.1, 0.6) | (1.58, 6.19) | “biased flexible” |
| (0.5, 0.9) | (2.2, 2.2) | “unbiased” |
They can be visually represented like this:
**Figure 1.** Visualization of Beta distributions.


Hearing utterances will gradually change the internal representation of the language for each agent, whatever their starting distribution.

For example, after hearing 10 and then 20 utterances “1”, the internal representation of the language will look like this:

**Figure 2.** The evolution of some examples of Beta priors (thick solid curves) after seeing some data (utterances), to become successive Beta posterior distributions (thin curves). Blue: an individual strongly biased against the feature; red: an individual weakly biased against the feature; and black: an unbiased individual. Top row: the prior distributions before seeing any data (“at birth”); middle row: the Beta distributions updated after seeing n=10 utterances all containing the value “1”; bottom row: after seeing n=20 such utterances.


Initial language

The initial language parameter corresponds to two situations:

  • on the one hand, it can model the (quite unrealistic) case where agents are born in a society without any pre-existing language or where they are not exposed to any linguistic input (k0 = 0, n0 = 0), so that the agents must create their first utterances based only on their prior bias (init_lang = 0 or init_lang = no in some plots).

  • on the other hand, it can model the more common case where agents are born in a society with a pre-existing language already biased towards the use of the feature (k0 = 4, n0 = 4); this is modelled by presenting all the agents with the same 4 utterances “1” in the initial iteration, so that the first utterances generated by the agents are based both on their prior bias and on the linguistic input from the society. In this analysis, the variant supported by agents having a bias (whether strong or weak) is always the utterance “0” (init_lang = 4 or init_lang = yes in some plots).

Here is a visualization of the Beta distribution curve of biased and unbiased agents, for both conditions of the initial language of the society:

**Figure 3.** With or without an initial language: these show the Beta distributions of the agents in the case where no initial language exists in the society (bottom row) and when such an initial language (mildly biased toward “1”) does exist (top row). The colors of the curves represent the three types of agents in our simulation (unbiased, and weakly and strongly biased; see also Figure 1).


As the condition with an initial language in the society is more realistic, we use it preferentially in the computations.

Learners

There are two widely-used strategies to produce utterances (among the many possible ones; Kirby et al., 2007):

  • sampler strategy (SAM): a language \(h\) is sampled at random from the universe of possible languages, proportional to its probability under the posterior distribution \(p(h|d)\).

  • maximum a posteriori strategy (MAP): we pick the language \(h_M\) that has the maximum posterior probability, \(h_M = \arg\max_{h \in U} p(h|d)\).

Time

The network we used is synchronous: the language values of all agents are updated simultaneously at the end of each iteration, after all agents have talked once. More precisely, in a given iteration, each agent is selected in turn in a random order and is allowed to produce one utterance (“speak”), an utterance that is “heard” by all its network neighbours. However, the agents do not update their internal representation of language until all have “spoken” (i.e., at the end of the iteration). In this way, each agent has the chance to “speak” and it does so using its representation of language from the previous iteration (if any), unaffected by any utterances it might have “heard” during the current iteration.

In short: each round, every individual says one utterance and listens to the utterance(s) of its neighbour(s); the language values of all individuals are updated at the same time, once all individuals have talked.
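This synchronous round can be sketched as follows (an illustrative Python sketch with SAM agents only; the names are ours, not those of the NetLogo code):

```python
import random

def tick(agents, neighbours, rng=random):
    """One synchronous round: every agent "speaks" once, in random order, using
    its representation from the previous tick; all "heard" utterances are
    applied only at the end of the round."""
    heard = {a: [] for a in agents}
    for speaker in rng.sample(list(agents), len(agents)):
        alpha, beta = agents[speaker]
        h = rng.betavariate(alpha, beta)          # SAM production
        utterance = 1 if rng.random() < h else 0
        for neighbour in neighbours[speaker]:
            heard[neighbour].append(utterance)
    for agent, utterances in heard.items():       # simultaneous update
        alpha, beta = agents[agent]
        k = sum(utterances)
        agents[agent] = (alpha + k, beta + len(utterances) - k)

# Two connected unbiased agents, one tick: each hears exactly one utterance
agents = {0: (2.2, 2.2), 1: (2.2, 2.2)}
neighbours = {0: [1], 1: [0]}
tick(agents, neighbours)
```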

In this analysis, we call each round a tick.

We study the evolution of language over 1000 ticks for the main analysis, and 500 ticks for the systematic bias effect study.

Network type

The network represents the socio-linguistic structure of a community and constrains the linguistic interactions between agents. The agents are the network’s nodes, and if there is an edge between two agents then those two agents will engage in linguistic interactions.

Please note that we consider here only static networks: there is no change, during a run, in the number of agents, in the properties of the agents (bias and utterance production mechanism), or in the topology of the network (i.e., the pattern of edges connecting the agents).

Likewise, our model does not include directed or weighted edges (i.e., two connected agents always interact symmetrically, and there is no way to specify that two agents might interact “more” than others), but we do think that dynamic, weighted, directed networks are an important avenue to explore in the future.

In this analysis, we use three types of networks, always generated randomly in NetLogo: Scale-free, Random, and Small-world networks.

Scale-free networks

Algorithm

We use the preferential attachment algorithm (Barabási, Albert, & Jeong, 2000). It starts from a seed of agents and gradually adds new ones; new links are created between the newly-added agents and the pre-existing agents following the rule that the more an agent is connected, the greater its chance to receive new connections. Formally, the probability \(p_i\) that a new agent is connected to agent \(i\) is \(p_i= \frac{k_i}{\sum_{j}k_j}\), where \(k_i\) is the degree of agent \(i\), and the sum is over all pre-existing agents \(j\).
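A minimal sketch of preferential attachment (illustrative Python; `seed_size` and the one-link-per-newcomer rule are our simplifying assumptions, and the actual networks are generated in NetLogo):

```python
import random

def preferential_attachment(n, seed_size=3, rng=random):
    """Grow a scale-free network: each new agent links to one existing agent
    chosen with probability proportional to its degree."""
    edges = [(i, (i + 1) % seed_size) for i in range(seed_size)]  # seed ring
    # Each agent appears in this list once per edge, so a uniform draw from it
    # is a degree-proportional draw over agents.
    degree_list = [a for e in edges for a in e]
    for new in range(seed_size, n):
        target = rng.choice(degree_list)
        edges.append((new, target))
        degree_list += [new, target]
    return edges

edges = preferential_attachment(100)
```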

Characteristics

Scale-free networks exhibit a power-law degree distribution: very few agents have many connections, while most have a limited number of links. These types of networks are found, for example, on the Internet (Albert, Jeong, & Barabási, 1999) or in cell biology (Albert, 2005).

Small-world networks

Algorithm

We use the classic beta model of the Watts–Strogatz algorithm (Watts & Strogatz, 1998). The algorithm first creates a ring of agents, where each agent is connected to a number \(N\) of neighbours on either side; each edge is then rewired with a chosen probability \(p\).

In this model, we always use the value \(N = 4\) and \(p=0.1\).
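The ring-then-rewire construction can be sketched as (illustrative Python; `k` is the number of neighbours per side, matching \(N = 4\), and the actual networks are generated in NetLogo):

```python
import random

def watts_strogatz(n, k=4, p=0.1, rng=random):
    """Beta-model small world: a ring where each agent connects to its k
    nearest neighbours on either side; each edge is rewired with probability p."""
    edges = set()
    for i in range(n):
        for j in range(1, k + 1):
            edges.add((i, (i + j) % n))          # ring lattice
    rewired = set()
    for (a, b) in sorted(edges):
        if rng.random() < p:                     # rewire one endpoint at random
            b = rng.randrange(n)
        if a != b and (b, a) not in rewired:     # drop self-loops / duplicates
            rewired.add((a, b))
    return rewired

edges = watts_strogatz(150)   # the paper's "medium" size, with N = 4, p = 0.1
```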

Characteristics

This process preserves high clustering while producing short average path lengths. Small-world properties were popularized by Milgram’s (1967) “Six degrees of separation” idea, and are found in many real-world phenomena, such as social influence networks (Kitsak et al., 2010) and semantic networks (Kenett et al., 2018).

Random networks

Algorithm

We use the popular Erdős–Rényi algorithm (Erdős & Rényi, 1959). We specify the number of agents and the overall connectivity of the graph by giving the probability of adding an edge between any two agents (\(p\)); in this model, we always use \(p=0.1\).
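This generator is a one-liner (illustrative Python; the actual networks are generated in NetLogo):

```python
import random

def erdos_renyi(n, p=0.1, rng=random):
    """G(n, p): each of the n(n-1)/2 possible edges is present with probability p."""
    return [(i, j) for i in range(n) for j in range(i + 1, n) if rng.random() < p]

edges = erdos_renyi(150)   # expected edge count: 0.1 * 150 * 149 / 2 ≈ 1117
```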

Characteristics

It is an unrealistic baseline model, which does not represent the structure of real-world networks.


Influencers biased

We are interested in what happens if the most influential people in a network are biased. To investigate this, we created the variable influencers_biased: it corresponds to the percentage of highest-centrality agents that are biased.

As an example, if 20% of the agents in the network are biased and influencers_biased is 15%, then the 15% most influential agents will be biased, and a further 5% of the rest of the network will be biased at random. Practically speaking:

  • if (prop_biased) \(\geq\) (influencers_biased), the (influencers_biased) most popular agents are biased, the rest of biased agents being randomly chosen in the population
  • if (prop_biased) \(<\) (influencers_biased), then (prop_biased) most popular agents are biased.
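This selection rule can be sketched as (illustrative Python; `biased_agents` and its arguments are our names, not the NetLogo variables):

```python
import random

def biased_agents(centrality, prop_biased, influencers_biased, rng=random):
    """Pick the biased agents: the top `influencers_biased` fraction by
    centrality first, then the remainder at random, capped at `prop_biased`."""
    n = len(centrality)
    n_biased = round(prop_biased * n)
    by_centrality = sorted(centrality, key=centrality.get, reverse=True)
    n_top = min(round(influencers_biased * n), n_biased)   # both rules above
    top = by_centrality[:n_top]
    rest = rng.sample([a for a in centrality if a not in top], n_biased - n_top)
    return set(top) | set(rest)

# Five agents with known centralities: 40% biased, 20% influencers biased
# means agent 0 (highest centrality) plus one random other agent is biased
centrality = {0: 0.9, 1: 0.8, 2: 0.5, 3: 0.3, 4: 0.1}
biased = biased_agents(centrality, 0.4, 0.2)
```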

In the main analysis, we selected only 2 values for influencers_biased:

  • 0% of influencers biased, so the biased agents in the population are selected randomly;
  • 10% of influencers biased, so the 10% most influential agents are biased (if prop_biased >= 10; otherwise, the prop_biased most influential agents are biased).

In the systematic bias effect study, we selected 3 values for influencers_biased:

  • 0% of influencers biased, so the biased agents in the population are selected randomly;
  • 50% of influencers biased, so the 50% most influential agents are biased (if prop_biased >= 50; otherwise, the prop_biased most influential agents are biased);
  • 100% of influencers biased, so the 100% most influential agents are biased (if prop_biased = 100; otherwise, the prop_biased most influential agents are biased).

Here, the most influential agents are determined using eigenvector centrality.

Dependent variables

We measured different variables using NetLogo’s BehaviorSpace tool.

Here is the exhaustive list of all variables that we used in this analysis:

Language value

The language value of an agent at a given moment varies between 0 and 1, and is the mode of the Beta distribution representing the internal belief of the agent concerning the distribution of the probability of utterances “1” in the language. Biased agents typically start with a lower language value than the unbiased agents, thus favoring the variant “0”. We also define the language value of a given group of agents (for example, a community or the whole network) as the mean of the language values of all the agents in the group. We decided to focus on the language value observed after 1,000 iterations, because the language value had always stabilized by then.

Inter-individual variation across the agents in a given network is an important outcome: we found that most biased and unbiased agents have very similar behaviors within their respective groups, justifying the use of the mean language values of the biased (langval_biased) and the unbiased agents (langval_control). We also computed the mean language value of the whole population (langval_all): even if there may be variation between groups (the biased vs the unbiased agents) and between agents, this value is a global indicator of the average language used in the population.

To summarize, we use the 3 following variables:

  • mean language value of all agents (langval_all) at final tick
  • mean language value of biased agents (langval_biased) at final tick
  • mean language value of unbiased agents (langval_control) at final tick

These variables were recorded directly inside NetLogo’s BehaviorSpace.

Difference between unbiased and biased agents

Here, we used the signed difference between the mean language value of the unbiased agents and the mean language value of the biased agents, as this gives very similar results to the much more computationally expensive method of computing all pairwise differences between all unbiased and biased agents:

  • difference of mean language value of unbiased agents and mean language value of biased agents (diff_group)

We computed this variable in R from the previous language value means.

Stabilization time

Intuitively, stabilisation time captures how long (in terms of interaction cycles) it takes for the language of a given network to reach a stable state. Given the inhomogeneous nature of the network, we consider three measures: the moment when the language value of the whole population stabilizes (stab_all), the moment when the language value of the biased agents stabilizes (stab_biased), and the moment when the language value of the unbiased agents stabilizes (stab_control); these measures are estimated using the language values of their respective populations. To summarize, we use:

  • moment when the language value of the society stabilizes (stab_all)
  • moment when the language value of the biased agents stabilizes (stab_biased)
  • moment when the language value of the unbiased agents stabilizes (stab_control)

Please note that the measure stab_all is the least accurate and representative of the actual behaviour of the agents. Consequently, we decided to mainly study the results of stab_biased and stab_control.

These variables were computed in R using the mean language values above. We used 2 different algorithms to compute the stabilization time (method 1 and method 2). After checking the results, we found method 2 to be more accurate, so this Rmarkdown only shows the results of the analysis using method 2.

Method 1

We used a discrete sliding window in which we estimated the derivative (i.e., the change), and we recorded this change. After the window had slid along the whole period of time, we selected the 15 values closest to 0. The value we chose as the stabilization time was the earliest among these 15 values.

This method is based on the method used by Jannsen (2018, p. 79). Practically speaking:

The maximum number of ticks of our model is \(nIterations = 1000\), and the size of the sliding window is \(w = nIterations/10\). We applied a loess function (a local polynomial regression fit) to the language values in each window. Then, we applied the following equation to the predicted points (the regression line):

\[t(e_g) = \frac{e_{g+w} - e_g}{w}\]

and we obtain a sequence of change scores \(\vec{e}= (t(e_1), t(e_2), ...)\). The window stops at the end of the series, so \(length(\vec{e}) = nIterations - w\). Then, we selected the 15 values in \(\vec{e}\) closest to 0. Among these, we selected the value \(t(e_g)\) with the minimum \(g\).

Method 2

The estimation is based on the method developed in Jannsen (2018:79). It uses a fixed-size sliding window within which we estimate the change in the language value; we multiply this number by 100, round it, and stop when this rounded number is equal to zero (i.e., the slope is within \(\pm\) 0.005 of 0.0) for 50 consecutive steps. Practically speaking, the maximum number of ticks of our model is \(nIterations = 1000\), and the size of the sliding window is \(w = nIterations/10\). For a given window, we estimated the change, \(t(e_{g})\), using the following formula:

\[t(e_g) = \frac{e_{g+w} - e_g}{w}*100\]

On the rounded \(t(e_{g})\) values, we find the first value of \(g\), \(g_{stabilization}\), for which the rounded \(t(e_{g})=0\), and check that there is no change (\(t(e_{g})=0\)) for the following 50 consecutive steps (i.e., \(g \in [g_{stabilization}.. (g_{stabilization}+50)]\)); in this case, the stabilization time is the first moment where there was no change, namely \(g_{stabilization}\).
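Method 2 can be sketched as follows (an illustrative Python version applied directly to a language-value series; the actual computation was done in R):

```python
def stabilization_time(values, window=None, run=50):
    """Method 2: slide a window of size len(values)//10; the change is
    (e[g+w] - e[g]) / w * 100, rounded; stabilization is the first g from
    which the rounded change stays 0 for `run` consecutive steps."""
    w = window or len(values) // 10
    change = [round((values[g + w] - values[g]) / w * 100)
              for g in range(len(values) - w)]
    for g in range(len(change) - run + 1):
        if all(c == 0 for c in change[g:g + run]):
            return g
    return None   # never stabilized within the series

# A toy series that rises quickly and then flattens out
series = [1 - 0.9 ** t for t in range(1000)]
g = stabilization_time(series)   # the rounded change first reaches 0 at g = 7
```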

Let us visualize where the stabilization point is found in an example.

**Figure 4.** Stabilization times for the biased and the unbiased agents. This example uses a scale-free network with 500 agents, with SAM agents, where 10% of the top influencers are strongly biased, in the presence of an initial language.


Dissemination

First, the inter-replication variation is estimated by computing the standard deviation of the language values obtained among the replications after 1,000 iterations. It captures the influence of various sources of randomness on each particular run of a given condition, and we computed it for 3 different groups:

  • dissemination of the results of different replications for all agents (diss_all)
  • dissemination of the results of different replications for biased agents only (diss_biased)
  • dissemination of the results of different replications for unbiased agents (diss_unbiased)

These variables were computed in R using the mean language values above.

The results are recorded in a new table, data_dissemination, which gathers the values of dissemination for each combination of conditions.
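The per-condition computation amounts to (illustrative Python with made-up condition labels and values; the actual computation was done in R):

```python
from statistics import stdev

def dissemination(replications):
    """Std of the final language values across the replications of each
    condition (one entry per combination of conditions)."""
    return {condition: stdev(values) for condition, values in replications.items()}

diss = dissemination({
    ("scale-free", "SAM"): [0.62, 0.60, 0.65, 0.61, 0.63],
    ("random", "SAM"):     [0.40, 0.42, 0.41, 0.43, 0.39],
})
```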

Community detection

In order to study the possible differences in the language values of the agents belonging to different communities, we first detect the structural communities within the network using the Louvain community detection algorithm (as implemented in NetLogo’s nw extension package), which detects communities by maximizing modularity based on the connections agents share with each other, and not on the agents’ language values.

Since the network is static, we then use the detected communities to compute the language value of each community for each iteration:

  • mean language value of each community (list mean);
  • std of the language value of each community (list std);
  • number of agents in each community (list nb node).

The results are lists whose lengths depend on the number of communities detected by the Louvain algorithm. Then, in R, we extracted 2 values from these 3 lists:

  • hetero_inter_group: heterogeneity between communities, computed as sd(list mean)

Interpretation: a low number indicates that all communities have approximately the same language value, whereas a high number indicates that the communities inside the network have different language values.

  • hetero_intra_group: heterogeneity within communities, computed as mean(list sd)

Interpretation: a low number indicates that people share the same language value inside each community, whereas a high number indicates that people can have quite different language values inside each community.
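The two heterogeneity measures can be sketched as (illustrative Python; the actual computation was done in R):

```python
from statistics import mean, stdev

def heterogeneity(community_means, community_sds):
    """hetero_inter_group = sd over the communities' mean language values;
    hetero_intra_group = mean of the communities' internal sds."""
    return stdev(community_means), mean(community_sds)

# Three communities with similar internal agreement but diverging means:
inter, intra = heterogeneity([0.2, 0.5, 0.8], [0.05, 0.04, 0.06])
```

High `inter` with low `intra` is the signature of communities that each converged internally, but on different languages.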

Cleaning data

The computations used are available in NetLogo’s BehaviorSpace tool. Once we obtained the resulting analysis.csv file, we cleaned it:

  1. change the variables’ names
  2. convert missing data to NaN (see Missing data)
  3. compute the stabilization time for all, biased and unbiased agents from the language values at all ticks, according to the algorithm presented in Stabilization time
  4. reduce the dataset’s size by keeping only the necessary information:
    • keep only the language values and the community values at ticks 0, 1 and 1000
    • change the format so that they appear in different columns
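The cleaning steps can be sketched as follows; this is an illustrative Python paraphrase of the R pipeline, with hypothetical raw column names, and it omits step 3 (the stabilization-time algorithm is given in its own section):

```python
import math

RENAME = {"language-value-all": "langval_all"}  # step 1 (illustrative names)
KEEP_TICKS = {0, 1, 1000}                       # step 4

def clean(records):
    """records: list of dicts, one per (replication, tick) row."""
    wide = {}
    for row in records:
        row = {RENAME.get(k, k): v for k, v in row.items()}      # 1. rename
        for k, v in row.items():                                 # 2. missing -> NaN
            if v == "":
                row[k] = math.nan
        if row["ticks"] not in KEEP_TICKS:                       # 4a. keep ticks 0/1/1000
            continue
        rep = wide.setdefault(row["rep_id"], {"rep_id": row["rep_id"]})
        rep[f"langval_all_{row['ticks']}"] = row["langval_all"]  # 4b. one column per tick
    return list(wide.values())

rows = [{"rep_id": 1, "ticks": t, "language-value-all": v}
        for t, v in [(0, 0.5), (1, 0.48), (500, 0.3), (1000, 0.21)]]
cleaned = clean(rows)
```

The widening in step 4b is what produces the suffixed columns (langval_all_0, langval_all_1, langval_all_1000) described below.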

Consequently, the following variables are only recorded at ticks 0, 1 and 1000 in the analysis.csv file:

  • Language values (for all, biased and unbiased agents)
  • Communities (mean, std and number of agents)

We recorded the following variables only at tick 1000:

  • Difference between unbiased and biased agents
  • Heterogeneity between groups
  • Heterogeneity within groups

Note: as these values are derived from the language values above, they could easily be computed for ticks 0 and 1 if needed.

Missing data

For logical reasons, the following categories contain missing data:

  • When there are 0% biased agents, the language value of biased agents, the stabilization time of biased agents, and the difference between biased and unbiased agents are missing.
  • When there are 0% unbiased agents, the language value of unbiased agents, the stabilization time of unbiased agents, and the difference between biased and unbiased agents are missing.
  • Communities were not studied in networks with 10 agents, because such networks do not contain enough agents for meaningful community detection. This is especially true for random networks, in which many of the 10 agents are isolated; thus, this category was not computed.
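A sketch of the first two rules (a Python illustration with the dataset’s column names; the real masking was done in R):

```python
import math

def mask_missing(row):
    """Set derived variables to NaN when their agent category is empty."""
    if row["prop_biased"] == 0:      # no biased agents in the population
        for col in ("langval_biased", "stab_biased", "diff_group"):
            row[col] = math.nan
    if row["prop_biased"] == 100:    # no unbiased agents in the population
        for col in ("langval_control", "stab_control", "diff_group"):
            row[col] = math.nan
    return row

r = mask_missing({"prop_biased": 0, "langval_biased": 0.2,
                  "stab_biased": 12, "diff_group": 0.1,
                  "langval_control": 0.6, "stab_control": 30})
```

This is why the summary of analysis.csv below shows 24000 NA’s for the biased and unbiased columns and 48000 NA’s for diff_group.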

Summary

example_time.csv

Here is a summary of our dataset example_time.csv:

Please note that, in order to save computation time, we recorded the community values only at ticks 0, 1 and 1000.

As a quick reminder, this file only contains a subset of replications, shown as an example. It does not contain all possible combinations of our independent variables. Go to Dataset for more information.

Table continues below
rep_id prop_biased learners bias_strength size_net
Min. :5401 10:400400 bayesianSAM:400400 0.1:200200 500:400400
1st Qu.:5501 NA NA 0.6:200200 NA
Median :6500 NA NA NA NA
Mean :6500 NA NA NA NA
3rd Qu.:7500 NA NA NA NA
Max. :7600 NA NA NA NA
Table continues below
init_langval influencers_biased ticks langval_all
Min. :4 0 :200200 Min. : 0 Min. :0.5338
1st Qu.:4 10:200200 1st Qu.: 250 1st Qu.:0.6047
Median :4 NA Median : 500 Median :0.6633
Mean :4 NA Mean : 500 Mean :0.6561
3rd Qu.:4 NA 3rd Qu.: 750 3rd Qu.:0.7077
Max. :4 NA Max. :1000 Max. :0.7879
Table continues below
langval_biased langval_control communities_mean communities_std
Min. :0.1348 Min. :0.5602 Length:400400 Length:400400
1st Qu.:0.3330 1st Qu.:0.6303 Class :character Class :character
Median :0.4948 Median :0.6812 Mode :character Mode :character
Mean :0.4929 Mean :0.6743 NA NA
3rd Qu.:0.6486 3rd Qu.:0.7160 NA NA
Max. :0.7474 Max. :0.8125 NA NA
communities_nbnodes network init_lang
Length:400400 scalefree:400400 4:400400
Class :character NA NA
Mode :character NA NA
NA NA NA
NA NA NA
NA NA NA

analysis.csv

Here is a summary of our dataset analysis.csv:

Please note that the last number (in our dependent variables) indicates the tick: for example, langval_control_0 indicates the language value recorded at tick 0 for unbiased agents, while langval_control_1000 indicates the language value recorded at tick 1000 for unbiased agents, etc.

This is the file that will mainly be used in our analysis. Go to Dataset and Set of combination 1 - analysis.csv for more information.

Table continues below
rep_id prop_biased learners bias_strength size_net
Min. : 1 0 :24000 MAP:60000 0.1:60000 10 :24000
1st Qu.: 3001 10 :24000 SAM:60000 0.6:60000 50 :24000
Median : 6500 30 :24000 NA NA 150 :24000
Mean : 7867 50 :24000 NA NA 500 :24000
3rd Qu.:12500 100:24000 NA NA 1000:24000
Max. :20000 NA NA NA NA
NA NA NA NA NA
Table continues below
init_lang influencers_biased network langval_all_0
0:60000 0 :60000 Random :40000 Min. :0.1000
4:60000 10:60000 Scale-free :40000 1st Qu.:0.3600
NA NA Small-world:40000 Median :0.4712
NA NA NA Mean :0.4832
NA NA NA 3rd Qu.:0.6578
NA NA NA Max. :0.8125
NA NA NA NA
Table continues below
langval_control_0 langval_biased_0 communities_mean_0 communities_std_0
Min. :0.500 Min. :0.100 Length:120000 Length:120000
1st Qu.:0.500 1st Qu.:0.100 Class :character Class :character
Median :0.656 Median :0.117 Mode :character Mode :character
Mean :0.656 Mean :0.201 NA NA
3rd Qu.:0.813 3rd Qu.:0.218 NA NA
Max. :0.813 Max. :0.469 NA NA
NA’s :24000 NA’s :24000 NA NA
Table continues below
communities_nbnodes_0 langval_all_1 langval_control_1 langval_biased_1
Length:120000 Min. :0.0000 Min. :0.000 Min. :0.000
Class :character 1st Qu.:0.2746 1st Qu.:0.428 1st Qu.:0.120
Mode :character Median :0.4526 Median :0.567 Median :0.157
NA Mean :0.4535 Mean :0.578 Mean :0.251
NA 3rd Qu.:0.6468 3rd Qu.:0.763 3rd Qu.:0.432
NA Max. :0.9339 Max. :0.934 Max. :0.865
NA NA NA’s :24000 NA’s :24000
Table continues below
communities_mean_1 communities_std_1 communities_nbnodes_1
Length:120000 Length:120000 Length:120000
Class :character Class :character Class :character
Mode :character Mode :character Mode :character
NA NA NA
NA NA NA
NA NA NA
NA NA NA
Table continues below
langval_all_1000 langval_control_1000 langval_biased_1000
Min. :0.0000 Min. :0.000 Min. :0.000
1st Qu.:0.2422 1st Qu.:0.325 1st Qu.:0.179
Median :0.4216 Median :0.475 Median :0.316
Mean :0.4293 Mean :0.492 Mean :0.350
3rd Qu.:0.6162 3rd Qu.:0.669 3rd Qu.:0.493
Max. :0.9979 Max. :0.998 Max. :0.981
NA NA’s :24000 NA’s :24000
Table continues below
communities_mean_1000 communities_std_1000 communities_nbnodes_1000
Length:120000 Length:120000 Length:120000
Class :character Class :character Class :character
Mode :character Mode :character Mode :character
NA NA NA
NA NA NA
NA NA NA
NA NA NA
Table continues below
stab_all stab_control stab_biased diff_group
Min. : 1.00 Min. : 1.0 Min. : 1.0 Min. :-0.44
1st Qu.: 13.00 1st Qu.: 31.0 1st Qu.: 34.0 1st Qu.: 0.00
Median : 50.00 Median : 84.0 Median :108.0 Median : 0.02
Mean : 82.11 Mean :113.4 Mean :151.3 Mean : 0.05
3rd Qu.:118.00 3rd Qu.:164.0 3rd Qu.:224.0 3rd Qu.: 0.06
Max. :849.00 Max. :846.0 Max. :850.0 Max. : 0.79
NA NA’s :24000 NA’s :24000 NA’s :48000
hetero_inter_group hetero_intra_group
Min. :0.000 Min. :0.000
1st Qu.:0.002 1st Qu.:0.004
Median :0.033 Median :0.016
Mean :0.040 Mean :0.023
3rd Qu.:0.064 3rd Qu.:0.037
Max. :0.325 Max. :0.134
NA’s :8154 NA’s :8000

You may notice some missing (NA) values in the heterogeneity measures: these occur when the network is very small (10 agents) and only one community was detected by the Louvain algorithm. As we will not study 10-agent networks in the heterogeneity analysis, we can safely ignore these missing data.

extra_analysis.csv

Here is a summary of our dataset extra_analysis.csv:

In order to save computation time, we recorded here only the final language value for all agents, at tick 500.

The size of the network is 150 agents, the learners are SAM samplers, and the society has an initial language value. Go to Dataset and Set of combination 2 - extra_analysis.csv for more information.

Table continues below
rep_id prop_biased bias_strength influencers_biased
Min. : 1 Min. : 0 Min. :0.01 0 :1499850
1st Qu.: 281222 1st Qu.: 25 1st Qu.:0.25 50 :1499850
Median : 649936 Median : 50 Median :0.50 100:1499850
Mean : 669934 Mean : 50 Mean :0.50 NA
3rd Qu.:1024898 3rd Qu.: 75 3rd Qu.:0.75 NA
Max. :1499850 Max. :100 Max. :0.99 NA
ticks langval_all network
Min. :500 Min. :0.09326 random :1499850
1st Qu.:500 1st Qu.:0.41813 scalefree :1499850
Median :500 Median :0.58055 smallworld:1499850
Mean :500 Mean :0.53937 NA
3rd Qu.:500 3rd Qu.:0.68430 NA
Max. :500 Max. :0.94574 NA

Example of language change through time for specific conditions

Here, we visualize what happens through time for biased and unbiased agents, in 4 different conditions, using example_time.csv:

  • strong bias and 0% of influencers;
  • weak bias and 0% of influencers;
  • strong bias and 10% of influencers;
  • weak bias and 10% of influencers.

Figure 5. Language (vertical axis, as language values) is changing across time (horizontal axis, in ticks) in a scale-free network with 500 SAM agents of which 10% are biased. Each individual curve represents the mean language value of the biased minority (blue) and the unbiased majority (light green) for 100 independent replications. The black curve shows the aggregated mean of the different replications. Top: the minority is strongly biased; bottom: the minority is weakly biased. Left: the biased minority is not overrepresented among the most influential agents in the network; right: the 10% most influential agents are occupied by biased agents.

Analysis

In this study, we focus on all our dependent variables, using the file analysis.csv:

  1. Language value (after 1000 ticks) for biased, unbiased and all agents
  2. Difference between unbiased and biased agents (after 1000 ticks)
  3. Stabilization time for biased, unbiased and all agents
  4. Dissemination for biased, unbiased and all agents
  5. Heterogeneity intra and inter group (after 1000 ticks)

Final value of language

Regression

We apply a classic linear regression model to our data using the R function lm. We study:

  1. the final value of the language for all agents at tick 1000;
  2. the final value of the language for unbiased agents at tick 1000;
  3. the final value of the language for biased agents at tick 1000.
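As a reminder of what lm computes, here is a minimal ordinary-least-squares sketch with a single predictor and made-up numbers (the real model has eight predictors, several of them categorical, and is fitted in R):

```python
from statistics import mean

def ols(x, y):
    """Closed-form simple linear regression: y ~ a + b*x."""
    mx, my = mean(x), mean(y)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
        sum((xi - mx) ** 2 for xi in x)
    a = my - b * mx
    return a, b

# Toy data mimicking the direction of the fitted effect below: the final
# language value decreases as the proportion of biased agents grows.
prop_biased = [0.0, 0.1, 0.3, 0.5, 1.0]
langval     = [0.62, 0.60, 0.57, 0.55, 0.48]
a, b = ols(prop_biased, langval)  # negative slope b, like prop_biased's coefficient
```

With categorical predictors such as network, lm additionally builds dummy variables (e.g., networkScale-free, networkSmall-world in the output below), but the least-squares principle is the same.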

1) For the final value of all agents:


Call:
lm(formula = langval_all_1000 ~ prop_biased + bias_strength + 
    size_net + learners + network + influencers_biased + init_lang, 
    data = subdata_lm)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.52268 -0.06905 -0.00292  0.07171  0.60128 

Coefficients:
                     Estimate Std. Error  t value Pr(>|t|)    
(Intercept)         0.4223141  0.0005855  721.348   <2e-16 ***
prop_biased        -0.1424344  0.0002927 -486.578   <2e-16 ***
bias_strength       0.0691894  0.0002927  236.362   <2e-16 ***
size_net            0.0043384  0.0002927   14.821   <2e-16 ***
learnersSAM        -0.0008647  0.0005855   -1.477     0.14    
networkScale-free   0.0096548  0.0007170   13.465   <2e-16 ***
networkSmall-world  0.0125769  0.0007170   17.540   <2e-16 ***
influencers_biased -0.0028282  0.0002927   -9.662   <2e-16 ***
init_lang           0.1204948  0.0002927  411.629   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.1014 on 119991 degrees of freedom
Multiple R-squared:  0.7941,    Adjusted R-squared:  0.7941 
F-statistic: 5.784e+04 on 8 and 119991 DF,  p-value: < 2.2e-16

Plot:


Figure 6. Effect of different variables on the final value of language (for all agents).

2) For the final value of unbiased agents:


Call:
lm(formula = langval_control_1000 ~ prop_biased + bias_strength + 
    size_net + learners + network + influencers_biased + init_lang, 
    data = subdata_lm)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.54033 -0.06036  0.00436  0.05930  0.59170 

Coefficients:
                     Estimate Std. Error  t value Pr(>|t|)    
(Intercept)         0.4030529  0.0006757  596.458  < 2e-16 ***
prop_biased        -0.1953250  0.0005782 -337.787  < 2e-16 ***
bias_strength       0.0597242  0.0003133  190.618  < 2e-16 ***
size_net            0.0039331  0.0003133   12.553  < 2e-16 ***
learnersSAM        -0.0092419  0.0006266  -14.748  < 2e-16 ***
networkScale-free   0.0149830  0.0007675   19.523  < 2e-16 ***
networkSmall-world  0.0099178  0.0007675   12.923  < 2e-16 ***
influencers_biased -0.0024605  0.0003133   -7.853 4.11e-15 ***
init_lang           0.1308255  0.0003133  417.547  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09708 on 95991 degrees of freedom
  (24000 observations deleted due to missingness)
Multiple R-squared:  0.7723,    Adjusted R-squared:  0.7723 
F-statistic: 4.07e+04 on 8 and 95991 DF,  p-value: < 2.2e-16

Plot:


Figure 7. Effect of different variables on the final value of language (for unbiased agents).

3) For the final value of biased agents:


Call:
lm(formula = langval_biased_1000 ~ prop_biased + bias_strength + 
    size_net + learners + network + influencers_biased + init_lang, 
    data = subdata_lm)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.45936 -0.06681  0.00512  0.06418  0.57070 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         0.3690710  0.0006193  595.97   <2e-16 ***
prop_biased        -0.0999087  0.0003248 -307.58   <2e-16 ***
bias_strength       0.0972547  0.0003066  317.24   <2e-16 ***
size_net            0.0051074  0.0003066   16.66   <2e-16 ***
learnersSAM         0.0112994  0.0006131   18.43   <2e-16 ***
networkScale-free  -0.0093377  0.0007509  -12.44   <2e-16 ***
networkSmall-world  0.0165135  0.0007509   21.99   <2e-16 ***
influencers_biased -0.0094275  0.0003066  -30.75   <2e-16 ***
init_lang           0.1074381  0.0003066  350.46   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.09499 on 95991 degrees of freedom
  (24000 observations deleted due to missingness)
Multiple R-squared:  0.7697,    Adjusted R-squared:  0.7697 
F-statistic: 4.011e+04 on 8 and 95991 DF,  p-value: < 2.2e-16

Plot:


Figure 8. Effect of different variables on the final value of language (for biased agents).

Conclusion:

All variables have a statistically significant effect on the language value. However, only prop_biased, bias_strength and init_lang have a large effect size. The variable network, when its value is Scale-free, might have a small effect on the language value too. Finally, the effects of size_net, learners and influencers_biased appear negligible.

Plot specific variables

The regression results give us hints about which variables matter for our analysis. We therefore study specific combinations of variables, without aggregating, in order to grasp the most interesting patterns.

Note: when not specifically mentioned, the parameters we use are network = Scale-free, learners = SAM, size_net = 150, influencers_biased = 0, init_lang = 4.

With an initial value of the language in the society


Figure 9. Effect of size and network type in a network with an initial language in the society (SAM, no influencers). The top brown line indicates the initial value of the language for unbiased agents, while the dotted lines indicate the initial value of the language for biased agents (lower line: strongly biased; middle line: weakly biased).

Without an initial value of the language in the society


Figure 10. Effect of size and network type in a network without an initial language in the society (SAM, no influencers). The top brown line indicates the initial value of the language for unbiased agents, while the dotted lines indicate the initial value of the language for biased agents (lower line: strongly biased; middle line: weakly biased).

Main variables only

Here, we plot only the 3 variables that had an impact on the final value of the language, namely:

  • prop_biased
  • bias_strength
  • init_lang

Figure 12. The final language value of the whole population for a scale-free network with 150 SAM agents. The solid line (1) shows the initial value of the language for the unbiased agents, while the dotted lines (2a and 2b) show the initial value of the language for biased agents (a: weakly biased and b: strongly biased). The horizontal axis shows the different cases considered (combinations of bias strength and proportion of biased agents in the populations), the vertical axis is the language value of the population, and the colored boxplots show the distribution of the language values among the biased (purple) and unbiased (green) agents.


Figure 13. The final language value of the whole population for a scale-free network with 150 SAM agents. The solid line (1) shows the initial value of the language for the unbiased agents, while the dotted lines (2a and 2b) show the initial value of the language for biased agents (a: weakly biased and b: strongly biased). The horizontal axis shows the different cases considered (combinations of bias strength and proportion of biased agents in the populations), the vertical axis is the language value of the population.

Influencers effect

Is there an effect of influencers? In the following plot, we focus on scale-free networks.


Figure 14. Effect of influencers in Scale-free networks (SAM, 150 agents, initial language).

The percentage of biased influencers does not have a strong impact on our results. It has a small effect on the language value of the population if the network is very small and if 10% of its agents are strongly biased. We also observe a difference between biased and unbiased agents in big networks when there are biased influencers; see Difference between unbiased and biased agents for more information.

Conclusion:

  • Logically, the more biased agents there are, and the more strongly biased they are, the lower the language value will be after 1000 ticks.
  • The initial society language also has a strong positive impact on the final value of the language of the society.
  • When only 10% of the population is strongly biased and the network is very small, the presence of influencers drags down the language value of the society in scale-free networks.

Statistics

Hypothesis

Our hypothesis is that the presence of biased agents in the population has an impact on the language of the society. More specifically, this means that the language value of a population into which we introduced biased agents is significantly different from the language value of a population with only unbiased agents.

Note: of course, this depends on the proportion of biased agents introduced in the network. We used values of 10, 30 and 50% biased agents for this analysis. For a finer analysis, please refer to Systematic bias effect study.

Wilcoxon test

In order to test this hypothesis, we performed unpaired Wilcoxon tests for all possible combinations of parameters, comparing:

  • the language value of unbiased agents at tick 1000 in a population without biased agents
  • the language value of unbiased agents at tick 1000 in a population with biased agents (10%, 30% or 50%)

The goal is to check whether, in some conditions, these language values do not differ significantly. We corrected the p-values for multiple testing using the Bonferroni method, and we list here all conditions where the difference is not significant (p > 0.05). The conditions are written in the following way:

size_net - init_lang - influencers_biased - learners - network - prop_biased - bias_strength

 [1] " 500-0- 0-SAM-Random-10-0.6"      " 500-4-10-SAM-Random-10-0.6"     
 [3] "1000-0- 0-SAM-Random-10-0.6"      "1000-0-10-SAM-Random-10-0.6"     
 [5] "1000-4- 0-SAM-Random-10-0.6"      "  50-0- 0-SAM-Random-10-0.6"     
 [7] "  50-0-10-MAP-Random-10-0.6"      "  50-4- 0-SAM-Random-10-0.6"     
 [9] " 150-0- 0-MAP-Random-10-0.6"      " 150-0- 0-SAM-Random-10-0.6"     
[11] " 150-4-10-SAM-Random-10-0.6"      "  10-0- 0-MAP-Scale-free-10-0.1" 
[13] "  10-0- 0-MAP-Small-world-10-0.1" "  10-0- 0-MAP-Small-world-10-0.6"
[15] "  10-0- 0-SAM-Scale-free-10-0.6"  "  10-0- 0-SAM-Small-world-10-0.6"
[17] "  10-0-10-MAP-Scale-free-10-0.6"  "  10-0-10-MAP-Small-world-10-0.6"
[19] "  10-0-10-SAM-Small-world-10-0.6" "  10-4- 0-MAP-Scale-free-10-0.6" 
[21] "  10-4- 0-MAP-Small-world-10-0.6" "  10-4- 0-SAM-Scale-free-10-0.6" 
[23] "  10-4- 0-SAM-Small-world-10-0.6" "  10-4-10-MAP-Small-world-10-0.6"
[25] "  10-4-10-SAM-Scale-free-10-0.6"  "  10-4-10-SAM-Small-world-10-0.6"
[27] "  50-0- 0-MAP-Small-world-10-0.6" "  50-0- 0-SAM-Small-world-10-0.6"
[29] "  50-0-10-SAM-Small-world-10-0.6" "  50-4- 0-MAP-Small-world-10-0.6"
[31] "  50-4- 0-SAM-Small-world-10-0.6" "  50-4-10-SAM-Small-world-10-0.6"
[33] "  10-0-10-SAM-Small-world-30-0.6" "  10-0- 0-SAM-Random-10-0.1"     
[35] "  10-0-10-SAM-Random-10-0.1"      "  10-0- 0-SAM-Random-10-0.6"     
[37] "  10-0-10-SAM-Random-10-0.6"      "  10-4- 0-SAM-Random-10-0.6"     
[39] "  10-4-10-SAM-Random-10-0.6"      "  10-0- 0-MAP-Random-10-0.1"     
[41] "  10-0-10-MAP-Random-10-0.1"      "  10-0- 0-MAP-Random-10-0.6"     
[43] "  10-0-10-MAP-Random-10-0.6"      "  10-4- 0-MAP-Random-10-0.6"     
[45] "  10-4-10-MAP-Random-10-0.6"      "  10-0-10-SAM-Random-30-0.6"     
[47] "  10-4- 0-SAM-Random-30-0.6"      "  10-0- 0-MAP-Random-30-0.6"     
[49] "  10-0-10-MAP-Random-30-0.6"      "  10-0-10-SAM-Random-50-0.6"     
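The testing procedure can be sketched in Python; the original analysis was run in R, so the following is a self-contained illustration using the normal approximation of the unpaired Wilcoxon rank-sum test (without tie correction) and a plain Bonferroni adjustment, on toy data:

```python
import math

def rank_sum_test(x, y):
    """Two-sided unpaired Wilcoxon rank-sum test (normal approximation;
    ties get average ranks, but no tie correction of the variance)."""
    n1, n2 = len(x), len(y)
    combined = sorted(list(x) + list(y))
    ranks, i = {}, 0
    while i < len(combined):                      # assign average ranks
        j = i
        while j < len(combined) and combined[j] == combined[i]:
            j += 1
        for k in range(i, j):
            ranks[combined[k]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        i = j
    w = sum(ranks[v] for v in x)                  # rank sum of first sample
    mean_w = n1 * (n1 + n2 + 1) / 2
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (w - mean_w) / sd_w
    return math.erfc(abs(z) / math.sqrt(2))       # two-sided p-value

def bonferroni(pvals):
    """Bonferroni correction: multiply each p by the number of tests, cap at 1."""
    m = len(pvals)
    return [min(1.0, p * m) for p in pvals]

# Toy final language values of unbiased agents, with vs without biased agents:
no_bias   = [0.61, 0.63, 0.60, 0.64, 0.62, 0.65]
with_bias = [0.41, 0.44, 0.40, 0.43, 0.45, 0.42]
p = rank_sum_test(no_bias, with_bias)
```

In the actual analysis each comparison uses 100 replications per condition and the correction is applied over all 720 condition combinations at once.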

Conclusion on statistics

Except for 50 out of 720 cases, the language value of networks with biased agents is significantly different from the language value of networks without biased agents.

These adjusted p-values show that, in the vast majority of the combinations, the language values of the unbiased agents in a society with biased agents are significantly different from those of a homogeneous unbiased population. Among the replications with no significant differences, most were networks with only 10 agents, and the remaining were random or small-world networks with a low percentage (10%) of weakly biased agents.

Main conclusion

  1. Indeed, the presence of biased agents impacts the language value of the society. See Systematic bias effect study for the extent to which this holds.
  2. The final value of the population language is shaped by the proportion of biased agents, the strength of their bias, and the initial language of the society.

Difference between biased and unbiased agents

Regression

We apply a classic linear regression model to our data using the R function lm on the variable diff_group.


Call:
lm(formula = diff_group ~ prop_biased + bias_strength + size_net + 
    learners + network + influencers_biased + init_lang, data = subdata_lm)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.46814 -0.03973 -0.01194  0.01429  0.70570 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         0.0394747  0.0006082  64.904   <2e-16 ***
prop_biased        -0.0362750  0.0006410 -56.589   <2e-16 ***
bias_strength      -0.0187004  0.0002954 -63.312   <2e-16 ***
size_net           -0.0024402  0.0002954  -8.262   <2e-16 ***
learnersSAM        -0.0056465  0.0005907  -9.558   <2e-16 ***
networkScale-free   0.0280823  0.0007235  38.814   <2e-16 ***
networkSmall-world -0.0136750  0.0007235 -18.901   <2e-16 ***
influencers_biased  0.0089878  0.0002954  30.429   <2e-16 ***
init_lang           0.0131163  0.0002954  44.406   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.07926 on 71991 degrees of freedom
  (48000 observations deleted due to missingness)
Multiple R-squared:  0.1602,    Adjusted R-squared:  0.1601 
F-statistic:  1716 on 8 and 71991 DF,  p-value: < 2.2e-16

Plot:


Figure 15. Effect of different variables on the difference between unbiased and biased agents.

Conclusion on regression

All variables have a statistically significant effect on the difference between unbiased and biased agents. However, only network and prop_biased have a fairly large effect size. The variable bias_strength might have a small effect too. Finally, the effects of size_net, init_lang, learners and influencers_biased appear negligible.

Plot specific variables

Differences between network types


Figure 16. The difference between the languages of the unbiased and the biased agents after 1,000 iterations, as a function of network type (panels), size (color), and bias frequency and strength (horizontal axis). We used SAM agents, there is no enrichment of biased agents among the top influencers, and agents were exposed to an initial language.

Influencers effect


Figure 17. Effect of influencers on the difference between unbiased and biased agents in scale-free networks (SAM, initial language).

Conclusion

After exploring many different combinations, we found:

  • differences between network types:

    • network size. On the one hand, in random networks, the differences between biased and unbiased agents shrink as the network grows (no differences in big networks!). On the other hand, in scale-free and small-world networks, the difference between biased and unbiased agents increases with the network’s size.
    • influencers effect. The presence of biased influencers has an effect only in scale-free networks, especially if prop_biased = 10% and the society is big.
  • interaction of prop_biased and bias_strength. In all networks, a low proportion of strongly biased agents amplifies the differences between biased and unbiased agents.

Statistics

Hypothesis

Our hypothesis is that biased agents keep a trace of their bias in their everyday language, even after interacting with unbiased agents. More specifically, this means that after 1000 ticks there would still be a difference between the language values of the biased and the unbiased agents.

Wilcoxon test

For each set of conditions (100 replications), we computed an unpaired Wilcoxon test between:

  • the language value of biased agents at tick = 1000;
  • the language value of unbiased agents at tick = 1000.

Then, we adjusted the p-values using the Bonferroni method.

Conclusion:

We found that the difference is not significant for 349 out of the 720 conditions, using a corrected p-value threshold of 0.05.

More specifically, we found that the difference is not significant for:

  • 90 cases out of 144 for networks with 10 nodes
  • 84 cases out of 144 for networks with 50 nodes
  • 69 cases out of 144 for networks with 150 nodes
  • 58 cases out of 144 for networks with 500 nodes
  • 48 cases out of 144 for networks with 1000 nodes

Looking more closely at the network type, we find that the difference is not significant for:

  • 190 cases out of 240 for random networks
  • 44 cases out of 240 for scale-free networks
  • 115 cases out of 240 for small-world networks

To conclude, the difference is almost always significant for scale-free networks (except for small networks with 10 or 50 agents, often weakly biased); it is significant for half of the small-world networks, especially big networks (more than 150 agents) with strong biases; however, most random networks do not show a significant difference, with the exception of a few very small networks (10 or 50 agents).

Main conclusion

  • In scale-free networks (and small-world to a smaller extent), biased agents keep something of their bias in their everyday language, even after interacting with other agents.

  • The stronger the bias, the bigger the difference between the biased and unbiased agents at the end, which is expected. But interestingly, the more biased agents there are, the smaller the difference! A population with 10% of strongly biased agents seems to be the condition where the difference is highest.

  • Influencers have a strong effect on the difference between biased and unbiased agents, especially in scale-free networks: they increase this difference, especially in big networks with 10% of biased agents.

Stabilization time

Regression

With stabilization for the whole population


Call:
lm(formula = stab_all ~ prop_biased + bias_strength + size_net + 
    learners + network + influencers_biased + init_lang, data = subdata_lm)

Residuals:
    Min      1Q  Median      3Q     Max 
-176.05  -48.18  -12.83   32.75  675.94 

Coefficients:
                   Estimate Std. Error  t value Pr(>|t|)    
(Intercept)         57.8443     0.4269  135.508   <2e-16 ***
prop_biased        -10.4790     0.2134  -49.097   <2e-16 ***
bias_strength        5.8927     0.2134   27.609   <2e-16 ***
size_net           -49.4204     0.2134 -231.547   <2e-16 ***
learnersSAM          6.0030     0.4269   14.063   <2e-16 ***
networkScale-free   59.2455     0.5228  113.322   <2e-16 ***
networkSmall-world   4.5583     0.5228    8.719   <2e-16 ***
influencers_biased   0.2215     0.2134    1.038    0.299    
init_lang            2.2276     0.2134   10.437   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 73.94 on 119991 degrees of freedom
Multiple R-squared:  0.3783,    Adjusted R-squared:  0.3782 
F-statistic:  9125 on 8 and 119991 DF,  p-value: < 2.2e-16

Plot:

**Figure 19.** Effect of different variables on the stabilization time (for all agents).

With stabilization only for biased agents


Call:
lm(formula = stab_biased ~ prop_biased + bias_strength + size_net + 
    learners + network + influencers_biased + init_lang, data = subdata_lm)

Residuals:
    Min      1Q  Median      3Q     Max 
-338.20  -74.38  -18.61   52.26  702.12 

Coefficients:
                   Estimate Std. Error  t value Pr(>|t|)    
(Intercept)        107.8115     0.7444  144.826  < 2e-16 ***
prop_biased        -68.2944     0.3905 -174.905  < 2e-16 ***
bias_strength      -27.0553     0.3685  -73.417  < 2e-16 ***
size_net           -33.6302     0.3685  -91.259  < 2e-16 ***
learnersSAM          5.7092     0.7370    7.746 9.56e-15 ***
networkScale-free  135.3545     0.9027  149.950  < 2e-16 ***
networkSmall-world  41.4951     0.9027   45.969  < 2e-16 ***
influencers_biased  -6.5393     0.3685  -17.745  < 2e-16 ***
init_lang           19.3457     0.3685   52.497  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 114.2 on 95991 degrees of freedom
  (24000 observations deleted due to missingness)
Multiple R-squared:  0.4253,    Adjusted R-squared:  0.4253 
F-statistic:  8881 on 8 and 95991 DF,  p-value: < 2.2e-16

Plot:

**Figure 20.** Effect of different variables on the stabilization time (for biased agents).

With stabilization only for unbiased agents


Call:
lm(formula = stab_control ~ prop_biased + bias_strength + size_net + 
    learners + network + influencers_biased + init_lang, data = subdata_lm)

Residuals:
    Min      1Q  Median      3Q     Max 
-231.40  -54.20  -11.78   37.10  698.67 

Coefficients:
                    Estimate Std. Error  t value Pr(>|t|)    
(Intercept)         90.87389    0.57480  158.097  < 2e-16 ***
prop_biased         32.81796    0.49187   66.721  < 2e-16 ***
bias_strength      -12.89537    0.26651  -48.385  < 2e-16 ***
size_net           -48.30404    0.26651 -181.244  < 2e-16 ***
learnersSAM          3.86354    0.53302    7.248 4.25e-13 ***
networkScale-free   94.28112    0.65282  144.422  < 2e-16 ***
networkSmall-world  10.57628    0.65282   16.201  < 2e-16 ***
influencers_biased   0.06573    0.26651    0.247    0.805    
init_lang            7.72047    0.26651   28.968  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 82.58 on 95991 degrees of freedom
  (24000 observations deleted due to missingness)
Multiple R-squared:  0.4059,    Adjusted R-squared:  0.4058 
F-statistic:  8197 on 8 and 95991 DF,  p-value: < 2.2e-16

Plot:

**Figure 21.** Effect of different variables on the stabilization time (for unbiased agents).

Conclusion:

All variables except influencers_biased (which is not significant for the whole population and for the unbiased agents) have a statistically significant effect on the stabilization time.

However, only network has a fairly large effect size, especially for scale-free networks. The variables size_net, prop_biased and, to a lesser extent, bias_strength might also have a small effect on the stabilization time. The effects of learners, init_lang and influencers_biased are negligible.

Plot specific variables

Size and network

**Figure 22.** Stabilization time for the biased and the unbiased agents (color), in different types of networks (columns) with two different sizes (rows), for various bias frequencies and strength (horizontal axis). The agents are SAM, there are no biased influencers, and there is an initial language.

Focus on biased agents

Here, we study the stabilization time for biased agents only:

**Figure 23.** Effect of network type and size on stabilization time for biased agents only (SAM, no influencers, initial language).

Conclusion:

  • Interaction between size_net and network: in random networks, agents stabilize faster when the network is big, while in scale-free and small-world networks, stabilization takes the same amount of time in big versus small networks.

  • In general, the language value (for biased and unbiased agents) stabilizes faster in networks with weakly biased agents.

  • In general, scale-free networks take the longest to stabilize.

  • Biased agents need more time to stabilize when only a small percentage of agents is strongly biased (10%).

Dissemination

We study here the dissemination of results across the 100 replications: the higher the dissemination, the more the language values at the final tick differ across replications.
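As a rough sketch (not the analysis code actually used), dissemination can be measured as the spread of the final mean language values across the replications of one condition; the function name and the example values below are hypothetical:

```python
from statistics import pstdev

def dissemination(final_values):
    """Dissemination of one condition: spread of the final mean
    language value across its replications (higher = more spread)."""
    return pstdev(final_values)

# Replications that all end near the same value -> low dissemination.
converged = [0.50, 0.51, 0.49, 0.50]
# Replications that end far apart -> high dissemination.
scattered = [0.10, 0.90, 0.30, 0.70]

assert dissemination(converged) < dissemination(scattered)
```

Any measure of spread would do; the standard deviation matches the `std_cond_*` variable names in the regressions below.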

Regression

For whole population


Call:
lm(formula = std_cond_all ~ prop_biased + bias_strength + size_net + 
    learners + network + influencers_biased + init_lang, data = data_dissemination_lm)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.046268 -0.019296 -0.006047  0.011810  0.159456 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         8.229e-02  2.642e-03  31.144  < 2e-16 ***
prop_biased        -3.664e-04  2.363e-05 -15.508  < 2e-16 ***
bias_strength       3.734e-02  3.349e-03  11.149  < 2e-16 ***
size_net           -4.872e-05  2.254e-06 -21.617  < 2e-16 ***
learnersSAM        -5.574e-03  1.675e-03  -3.328 0.000901 ***
networkScale-free  -1.414e-02  2.051e-03  -6.893 8.86e-12 ***
networkSmall-world -1.193e-03  2.051e-03  -0.581 0.561059    
influencers_biased -1.004e-04  1.675e-04  -0.600 0.548792    
init_lang          -4.061e-03  4.187e-04  -9.699  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02901 on 1191 degrees of freedom
Multiple R-squared:  0.4554,    Adjusted R-squared:  0.4518 
F-statistic: 124.5 on 8 and 1191 DF,  p-value: < 2.2e-16

Plot:

**Figure 24.** Effect of different variables on dissemination (for all agents).

For biased agents


Call:
lm(formula = std_cond_biased ~ prop_biased + bias_strength + 
    size_net + learners + network + influencers_biased + init_lang, 
    data = data_dissemination_lm)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.048238 -0.018171 -0.006018  0.010246  0.139707 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         7.275e-02  2.871e-03  25.336  < 2e-16 ***
prop_biased        -3.932e-04  2.609e-05 -15.068  < 2e-16 ***
bias_strength       5.567e-02  3.491e-03  15.948  < 2e-16 ***
size_net           -4.517e-05  2.349e-06 -19.231  < 2e-16 ***
learnersSAM        -4.173e-03  1.745e-03  -2.391    0.017 *  
networkScale-free  -1.566e-02  2.138e-03  -7.325 5.10e-13 ***
networkSmall-world -3.469e-03  2.138e-03  -1.623    0.105    
influencers_biased -4.018e-05  1.745e-04  -0.230    0.818    
init_lang          -2.162e-03  4.364e-04  -4.953 8.62e-07 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.02704 on 951 degrees of freedom
Multiple R-squared:  0.4973,    Adjusted R-squared:  0.4931 
F-statistic: 117.6 on 8 and 951 DF,  p-value: < 2.2e-16

Plot:

**Figure 25.** Effect of different variables on dissemination (for biased agents).

For unbiased agents


Call:
lm(formula = std_cond_unbiased ~ prop_biased + bias_strength + 
    size_net + learners + network + influencers_biased + init_lang, 
    data = data_dissemination_lm)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.052791 -0.021432 -0.007021  0.013391  0.149992 

Coefficients:
                     Estimate Std. Error t value Pr(>|t|)    
(Intercept)         9.752e-02  3.195e-03  30.523  < 2e-16 ***
prop_biased        -3.539e-04  5.215e-05  -6.787 2.01e-11 ***
bias_strength       2.502e-02  4.005e-03   6.247 6.29e-10 ***
size_net           -5.849e-05  2.695e-06 -21.705  < 2e-16 ***
learnersSAM        -7.230e-03  2.003e-03  -3.610 0.000322 ***
networkScale-free  -1.800e-02  2.453e-03  -7.339 4.62e-13 ***
networkSmall-world -5.627e-03  2.453e-03  -2.294 0.021990 *  
influencers_biased -7.619e-05  2.003e-04  -0.380 0.703699    
init_lang          -5.227e-03  5.007e-04 -10.440  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.03103 on 951 degrees of freedom
Multiple R-squared:  0.4359,    Adjusted R-squared:  0.4311 
F-statistic: 91.84 on 8 and 951 DF,  p-value: < 2.2e-16

Plot:

**Figure 26.** Effect of different variables on dissemination (for unbiased agents).

Conclusion:

Most variables have a statistically significant effect on dissemination for biased and unbiased agents. However, only bias_strength and network have a fairly large effect size, especially for scale-free networks. The variables learners and init_lang might have a small effect on dissemination too. The effects of size_net, prop_biased and influencers_biased are negligible.

Plot specific variables

The dissemination of biased and unbiased agents is approximately the same as the dissemination results for all agents: thus, we will plot only the dissemination for all agents in the following plot.

**Figure 27.** Effect of network type and size on dissemination (SAM, no influencers, initial language).

Conclusion:

  • the dissemination of results is always higher when the agents were not initialized with an initial language of the society;

  • dissemination across replications is higher in random networks than in scale-free and small-world networks;

  • in random networks, the dissemination of results increases when the bias is weak.

Linguistic communities and heterogeneity

Here, we use two variables:

  • hetero_inter_group: heterogeneity between linguistic communities
  • hetero_intra_group: heterogeneity within linguistic communities

Please note that when we refer to communities here, we always mean the linguistic communities. The structural communities detected by the Louvain algorithm are fixed at the beginning of the simulation (see Community detection).
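Plausible definitions of these two measures can be sketched as follows; the exact formulas used in the analysis may differ, so treat this as an illustration (the function names mirror the variables above, but the implementations are assumptions):

```python
from statistics import mean, pstdev

def hetero_inter_group(communities):
    """Heterogeneity BETWEEN communities: spread of the per-community
    mean language values (assumed definition)."""
    return pstdev([mean(c) for c in communities])

def hetero_intra_group(communities):
    """Heterogeneity WITHIN communities: average spread of the language
    values inside each community (assumed definition)."""
    return mean([pstdev(c) for c in communities])

# Two communities that internally agree but differ from each other:
polarised = [[0.1, 0.1, 0.1], [0.9, 0.9, 0.9]]
assert hetero_inter_group(polarised) > hetero_intra_group(polarised)
```

Under these definitions, a population split into internally uniform but mutually distinct communities scores high between-group and low within-group heterogeneity.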

Heterogeneity between communities

Regression


Call:
lm(formula = hetero_inter_group * 100 ~ prop_biased + bias_strength + 
    size_net + learners + network + influencers_biased + init_lang, 
    data = subdata_lm)

Residuals:
    Min      1Q  Median      3Q     Max 
-8.5935 -1.7340 -0.2951  1.3616 30.7963 

Coefficients:
                    Estimate Std. Error  t value Pr(>|t|)    
(Intercept)         1.089515   0.019199   56.750  < 2e-16 ***
prop_biased        -1.063898   0.009005 -118.142  < 2e-16 ***
bias_strength       0.051295   0.009005    5.696 1.23e-08 ***
size_net            0.182980   0.009003   20.323  < 2e-16 ***
learnersSAM        -0.397205   0.018011  -22.054  < 2e-16 ***
networkScale-free   6.135480   0.022683  270.485  < 2e-16 ***
networkSmall-world  2.611845   0.022687  115.123  < 2e-16 ***
influencers_biased  0.144424   0.009005   16.037  < 2e-16 ***
init_lang          -0.241280   0.009005  -26.793  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 3.012 on 111837 degrees of freedom
  (8154 observations deleted due to missingness)
Multiple R-squared:  0.4473,    Adjusted R-squared:  0.4472 
F-statistic: 1.131e+04 on 8 and 111837 DF,  p-value: < 2.2e-16

Plot regression

**Figure 28.** Effect of different variables on heterogeneity between communities.

The network type has a very strong impact on heterogeneity between groups. The proportion of biased agents also has a small effect on it, and there might also be a small difference between SAM and MAP learners. Finally, it seems that the size of the network, the strength of the bias, the biased influencers and the initial language have a negligible impact on heterogeneity between groups.

Plot specific variables

Is there a difference between SAM and MAP learners?

**Figure 29.** Effect of network type and size on heterogeneity between communities (SAM, no influencers, initial language).

Is there a difference when agents are weakly versus strongly biased?

**Figure 30.** The difference in heterogeneity between linguistic communities function of network type (columns) and size (colors), and bias strength (rows) and frequency (horizontal axis). The networks contain SAM agents, no influencers are biased, and there is an initial language.

Statistics

Hypothesis

Our hypothesis is that the presence of biased agents in the population increases the heterogeneity between communities.

Wilcoxon test

In order to test this hypothesis, we performed unpaired Wilcoxon tests for all possible combinations of parameters, comparing:

  • the heterogeneity of unbiased agents in a population with biased agents
  • the heterogeneity of unbiased agents in a population without biased agents

The goal is to check whether, in some conditions, these heterogeneity values do not differ significantly. We corrected the p-values for multiple testing using the Bonferroni method:

  • In scale-free networks containing strongly biased agents, 13 out of 96 cases are not significant.

  • In scale-free networks containing weakly biased agents, 43 out of 96 cases are not significant.

  • In small-world networks containing strongly biased agents, 24 out of 96 cases are not significant.

  • In small-world networks containing weakly biased agents, 52 out of 96 cases are not significant.

  • In random networks containing strongly biased agents, 41 out of 96 cases are not significant.

  • In random networks containing weakly biased agents, 76 out of 96 cases are not significant.

These adjusted p-values show that, in scale-free networks with a strong bias, having biased agents in the network significantly affects the emergence of linguistic communities; this is also true, to a smaller extent, for small-world networks with strongly biased agents. However, in scale-free and small-world networks containing weakly biased agents, only about half of the comparisons are significant; thus, the heterogeneity observed in small-world and scale-free networks containing weakly biased nodes is probably mostly due to the structure of the network itself.
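The Bonferroni step itself is simple and can be sketched in a few lines (the raw p-values below are hypothetical, not the ones behind the counts above; the actual tests were run with R's wilcox.test):

```python
def bonferroni(p_values):
    """Bonferroni correction: multiply each raw p-value by the number
    of tests performed, capping the result at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Hypothetical raw p-values from 4 pairwise Wilcoxon tests:
raw = [0.001, 0.02, 0.2, 0.9]
adjusted = bonferroni(raw)
significant = [p < 0.05 for p in adjusted]
```

Note how a raw p-value of 0.02, nominally significant, survives or not depending on how many comparisons were made; this is why so many of the 96 comparisons per condition end up non-significant.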

Conclusion

  • There is a clear effect of network type on heterogeneity between groups: heterogeneity is higher for scale-free than for small-world networks, and higher for small-world compared to random networks.

  • The size of the network does not impact heterogeneity in scale-free and small-world networks, but does in random networks: small random networks have a higher heterogeneity compared to big ones.

  • Finally, heterogeneity between groups is higher when the network has some diversity, and contains both biased agents and unbiased agents.

Heterogeneity within communities

Regression


Call:
lm(formula = hetero_intra_group * 100 ~ prop_biased + bias_strength + 
    size_net + learners + network + influencers_biased + init_lang, 
    data = subdata_lm)

Residuals:
    Min      1Q  Median      3Q     Max 
-3.3273 -1.1037 -0.0738  1.0132  8.2195 

Coefficients:
                    Estimate Std. Error  t value Pr(>|t|)    
(Intercept)         0.427119   0.009426   45.314  < 2e-16 ***
prop_biased        -0.492532   0.004419 -111.459  < 2e-16 ***
bias_strength      -0.022906   0.004419   -5.184 2.18e-07 ***
size_net            0.868085   0.004419  196.459  < 2e-16 ***
learnersSAM        -0.157437   0.008838  -17.814  < 2e-16 ***
networkScale-free   3.078552   0.011135  276.468  < 2e-16 ***
networkSmall-world  2.132313   0.011135  191.491  < 2e-16 ***
influencers_biased  0.028787   0.004419    6.514 7.33e-11 ***
init_lang          -0.045729   0.004419  -10.348  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.479 on 111991 degrees of freedom
  (8000 observations deleted due to missingness)
Multiple R-squared:  0.5183,    Adjusted R-squared:  0.5182 
F-statistic: 1.506e+04 on 8 and 111991 DF,  p-value: < 2.2e-16

Plot regression

**Figure 31.** Effect of different variables on heterogeneity within communities.

The network type has a very strong impact on heterogeneity within groups. The proportion of biased agents and the size of the network also have a small effect on it, and there might also be a small difference between SAM and MAP learners. Finally, it seems that the strength of the bias, the biased influencers and the initial language have a negligible impact on heterogeneity within groups.

Plot specific variables

**Figure 32.** Effect of network type and size on heterogeneity within communities (SAM, no influencers, initial language).

Conclusion:

  • The heterogeneity within groups is higher for scale-free networks compared to small-world networks; small-world networks also have, on average, a higher heterogeneity within groups compared to random networks.

  • In random networks, heterogeneity within groups decreases with size, whereas it increases with size in small-world and scale-free networks.

  • Finally, heterogeneity within groups is higher when the network has some diversity: biased agents and unbiased agents.

Conclusion on heterogeneity between and within groups:

  • Agents in random networks have a very homogeneous language value; they almost all share the same language value. In contrast, agents in small-world and scale-free networks are more heterogeneous, which corroborates the results found in Difference between unbiased and biased agents, and can explain why the heterogeneity within groups is high in these networks.

  • Furthermore, heterogeneity between linguistic communities seems to naturally emerge in scale-free and small-world networks even with agents who are not too strongly biased; moreover, strongly biased agents amplify the language differences between linguistic communities in scale-free networks.

Systematic bias effect study

The following figure shows the joint influence of the proportion of biased agents and the strength of the bias on the population’s language value for the set of values in the “Systematic bias effects study” (see Set of combination 2 - extra_analysis.csv). We decided to further investigate the effect of these two parameters because of their large effect sizes. In this study, we ran 50 independent replications for each possible combination of:

  • the bias strength (going from 0.0=very strongly biased to 1.0=very weakly biased, in steps of 0.01) and
  • the proportion of biased agents in the population (going from 0% to 100% in steps of 1%).

**Figure 33.** Limit for language value according to bias strength and % of biased people - aggregated by condition.

For each replication, we computed the mean language value of the population after 500 iterations, and then averaged over the 50 independent replications of each combination: for example, the averaged mean language value of the population is 0.67 for the condition {bias_strength=0.70 & prop_biased=35}, but 0.22 for the condition {bias_strength=0.15 & prop_biased=80}. In general, the aggregated mean language value progressively increases with the proportion of biased agents and the strength of the bias.
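The aggregation step can be sketched as follows (the data layout and helper name are hypothetical); the same helper covers both the mean aggregation used for Figure 33 and the maximum aggregation used for Figure 35:

```python
from statistics import mean

def aggregate(results, reducer=mean):
    """Collapse the replications of each condition into one number.
    `results` maps (bias_strength, prop_biased) -> list of final mean
    language values, one value per replication."""
    return {cond: reducer(reps) for cond, reps in results.items()}

# Hypothetical final values for two conditions, three replications each:
results = {
    (0.70, 35): [0.65, 0.68, 0.66],
    (0.15, 80): [0.20, 0.25, 0.21],
}
by_mean = aggregate(results)               # mean aggregation (Figure 33)
by_max = aggregate(results, reducer=max)   # maximum aggregation (Figure 35)
```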

In order to better visualize the shape of the relationship between the bias strength and frequency (i.e., linear or not), and also to check if the proportion of biased influencers impacts the results, we also show the set of isolines for the mean language value of the population. These isolines are defined as the maximum values of the combination of bias_strength and prop_biased for a given set of language values.

**Figure 34.** Limit for language value according to bias strength and % of biased people - aggregated by condition.

Interestingly, the relationship between the strength of the bias and the proportion of biased agents is relatively linear when the proportion of biased agents is high and/or when the bias is strong, but becomes nonlinear for low frequencies of the biased agents and for weak biases. In this latter case, the effect of biased agents on the language value of the population is much stronger than expected.

Moreover, this analysis helps us understand under what conditions an initial language strongly favouring “1” may change to a language favouring the variant “0”: while only populations with a large proportion of strongly biased agents (> 50%) end up with a language strongly favouring “0” (a language value of 0.2), it is enough for only 15%-20% of the population to have a strong bias for the language to reach a moderate preference for “0” (a language value of 0.4). However, please note that while these particular values depend critically on the initial language (i.e., the number of initial utterances and the distribution of “0” and “1” utterances), they do support qualitative inferences concerning the influence of biased agents in a population.

We also plot the same type of graph, except that we selected the maximum language value over the 50 replications instead of their mean:

**Figure 35.** Limit for language value according to bias strength and % of biased people - maximum value.

Appendix: Netlogo guide

BehaviorSpace

BehaviorSpace is a software tool integrated with NetLogo that allows you to perform experiments with models. BehaviorSpace runs a model many times, systematically varying the model’s settings and recording the results of each run. This process lets you explore the model’s “space” of possible behaviors and determine which combinations of settings cause the behaviors of interest. After opening the NetLogo model, you can access BehaviorSpace via «Tools», «BehaviorSpace». Its use is quite intuitive and will not be explained here. In the next part, I will explain the role of each variable in our NetLogo model.

Explanation of dynamicity

Generating a network

Before generating a network, you must clear the environment with setup-clear.

Num-agents.

Then, you can set the number of agents in the network (num-agents). In theory, you can select any number of agents > 2. In practice, data visualization becomes inefficient beyond ~500 agents. In BehaviorSpace, and depending on the computer used, you can go up to 1000 agents without any computational problems, and up to 2000 agents if the number of conditions is not too high.

Note: if you want to select a very low number of agents (< 5), be careful to select the option «On» for choose-N-influent. Otherwise, if «Off» is selected, the program will try to give 2 times more bias to central agents; as this is impossible with 2 agents, it will keep looking for alternatives and eventually crash.

Directed/Undirected links.

In Other parameters, select «directed» or «undirected» in the box «links-to-use». The default option is undirected. Please note that some additional options (centrality check, …) have been created only for undirected networks; if you want to use directed links, there might be some problems (this will require some more coding :) ).

Synchronicity.

Selecting the option “on” will make the network synchronous; in the off mode, the network will be asynchronous. More concretely, in a synchronous network, the new language values of all nodes are updated at the same time at the end of the tick, after all agents have talked once. In an asynchronous network, an agent’s language value is updated immediately after it hears an utterance.
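The difference between the two update modes can be sketched outside NetLogo; the toy update rule below (move halfway towards the left neighbour's value, with wrap-around) is purely illustrative, not the model's actual learning rule:

```python
def step_synchronous(values, update):
    """All agents hear first; every new language value is computed from
    the OLD values and applied together at the end of the tick."""
    return [update(values, i) for i in range(len(values))]

def step_asynchronous(values, update):
    """Each agent's language value is updated immediately, so later
    agents already see the new values of earlier agents."""
    values = list(values)
    for i in range(len(values)):
        values[i] = update(values, i)
    return values

# Toy rule: move halfway towards the left neighbour (index i-1 wraps).
def towards_left(values, i):
    return (values[i] + values[i - 1]) / 2

sync = step_synchronous([0.0, 1.0], towards_left)    # [0.5, 0.5]
asyn = step_asynchronous([0.0, 1.0], towards_left)   # [0.5, 0.75]
assert sync != asyn  # the update order changes the outcome
```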

Layout.

The layout affects the arrangement of the agents for data visualization; it has no effect in BehaviorSpace runs. You can select 4 layout options: «spring», «circle», «radial», «tutte».

We recommend the following: radial for scale-free, wheel and star networks; circle for random, small-world and ring networks. The spring layout can be a good alternative when plotting many types of networks.

The two buttons «layout» and «layout once» allow you to reset the layout once the network has been created. «Layout» will continuously reset the layout (useful for dynamic networks) whereas «layout once» will reset it only once.

Dynamicity.

On the bottom left, there are several boxes under the heading «Dynamic network parameters». The default option is a static network, i.e. a network where proba-rewire = 0.

Proba-rewire determines the probability of rewiring, at each tick and for each agent. The total number of links remains the same throughout the process. Rewiring works as follows:

  • for each agent, draw a random float R between 0 and 1: if R < proba-rewire, then start the rewiring process for the selected agent i.

  • assign values to all agents in the network relative to the selected agent \(i\):

    • the distance (= shortest path) between each agent and agent i. If no path exists, the value 0 is assigned.
    • the similarity of the language value between each agent and agent i. Here, similarity = absolute value of (language value of agent i - language value of the other agent).
    • the centrality of each agent (which does not depend on agent i). Currently, betweenness centrality is used, but this can easily be changed for other centrality measures.
  • normalize these numbers so that all measures vary between 0 and 1.

  • compute the overall probability for agent i to rewire with each other agent, using the following formula: (distance × import-dist + similarity-language-value × import-lang + centrality × import-central) / (import-dist + import-central + import-lang). The values of import-dist, import-lang and import-central are manually selected by the user and act as weights assigned to each previously computed measure: for example, one can give higher importance to connecting with agents that have a high centrality.

  • Then, the program randomly selects an agent \(p\) according to this probability, randomly removes one link of agent \(i\), and wires agent \(i\) with agent \(p\).
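The weighted-probability step above can be sketched in Python (rather than NetLogo); the candidate values below are hypothetical and assumed to be already normalized to [0, 1]:

```python
def rewire_weights(distance, similarity, centrality,
                   import_dist, import_lang, import_central):
    """Attractiveness of each candidate agent for rewiring agent i,
    following the formula above; one entry per candidate agent."""
    total = import_dist + import_lang + import_central
    return [
        (d * import_dist + s * import_lang + c * import_central) / total
        for d, s, c in zip(distance, similarity, centrality)
    ]

# Two candidate agents, G and H (values hypothetical):
w = rewire_weights(distance=[1.0, 0.2],
                   similarity=[0.9, 0.1],
                   centrality=[0.8, 0.2],
                   import_dist=1, import_lang=1, import_central=1)
assert w[0] > w[1]  # G is the more likely rewiring target
```

Raising import-central relative to the other two weights makes high-centrality agents proportionally more attractive as rewiring targets.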

Explanation of dynamicity

In this example, agent \(i\) has a probability of 0.9 of rewiring with agent \(G\), and a probability of 0.175 of rewiring with agent \(H\). As agent \(i\) only has one neighbor (agent \(F\)), the link between agent \(i\) and agent \(F\) will be removed and a new link between agent \(i\) and \(G\) will most likely be created.

Please note that dynamicity breaks the structure of scale-free networks. Setting a high value for import-central will slow down this process, but the network will eventually look like a random network after many ticks.

Network generation

Please note that in BehaviorSpace, whereas all variables (num-agents, etc.) must be entered in the first box «vary variables», the network generation commands must be entered in the setup commands.

So, in setup commands, you must first enter setup-clear and then the name of the network you want to generate (Random, for example). You have several options:

  • Random network: requires selecting the connection-prob.
  • Small-world network: follows the Watts-Strogatz algorithm. Requires selecting the neighborhood-size (the number of neighbors each agent is initially connected with) and the rewire-prob (the probability for each agent of rewiring after the initial configuration has been set up).
  • preferential attachment: scale-free network, following the Barabási-Albert algorithm.
  • ring: each agent is connected to only one agent; equivalent to a small-world network with neighborhood-size = 1 and rewire-prob = 0.
  • star: there is a central agent, linked to all other agents; the other agents are only connected to this central agent.
  • wheel: same as star, except that the non-central agents are also connected to their neighbors.
  • there are also other options such as small-world lattice (lattice-2d, kleinberg) algorithms, but these are not fully working yet (they do not use the number of agents entered earlier, but rather the options nb-rows, nb-cols, and clustering-exponent).
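As an illustration of the first option, a Random network can be sketched as an Erdős-Rényi style generator; this is illustrative Python, not the NetLogo implementation:

```python
import random

def random_network(num_agents, connection_prob, seed=None):
    """Erdos-Renyi style random network: each pair of agents is linked
    (undirected) independently with probability connection-prob."""
    rng = random.Random(seed)
    links = set()
    for i in range(num_agents):
        for j in range(i + 1, num_agents):
            if rng.random() < connection_prob:
                links.add((i, j))
    return links

links = random_network(10, 1.0)   # connection-prob = 1 -> complete graph
assert len(links) == 10 * 9 // 2  # 45 undirected links
```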

More info here.

The network has been created! Please note that default values were used for the settings of the language value, bias, etc.

There are also buttons to detect communities, show agent centralities, show clusters and save the matrix.

Affect initial language values

The internal language value is continuous and ranges from 0 (against the feature) to 1 (in favour of the feature). Utterances are binary: with the feature (= 1) or without the feature (= 0).

Initial value of the society’s language

This part applies only to Bayesian communication algorithms. First of all, you need to select the initial value of the society’s language. Before the simulation starts, each agent will hear a fixed number of utterances (total-numb-utt). If you do not want any initial value of the society’s language, set total-numb-utt to 0.

Then, you need to select the number of utterances equal to 1 (number-utt-heard-start) among the N = total-numb-utt utterances heard. For example, if unbiased agents hear 3 utterances equal to 1 (number-utt-heard-start = 3) out of 4 utterances (total-numb-utt = 4), they will start with an a-priori language tending towards possessing the feature.
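Under a simple reading of these settings, an unbiased agent's starting language value is the fraction of initial utterances equal to 1; this sketch is a simplification of the Bayesian update (it ignores the agent's prior and bias), with the parameter names taken from the interface:

```python
def initial_language_value(number_utt_heard_start, total_numb_utt):
    """Starting point of an unbiased agent: fraction of the initial
    society utterances equal to 1 (simplified; ignores the prior)."""
    if total_numb_utt == 0:
        return None  # no initial society language
    return number_utt_heard_start / total_numb_utt

# 3 "1"-utterances out of 4 -> language leaning towards the feature:
assert initial_language_value(3, 4) == 0.75
assert initial_language_value(3, 4) > 0.5
```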

Random initial value language

The agents can start with a language value determined by the bias (Random-initial-lang-value = Off), or with random initial values for the language (Random-initial-lang-value = On).

For the Bayesian algorithm, we recommend starting with a language value determined by the bias. For non-Bayesian communication algorithms, such as the probabilistic algorithm, setting this value to «On» allows setting the initial value of the society’s language. Currently, values can only be assigned according to a normal distribution, but this can easily be changed for other types of distributions. You can set the initial-value-language and the standard-dev of the normal distribution; if you want all agents to have the same value, set standard-dev = 0.
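Assuming out-of-range draws are simply clipped to the valid range [0, 1] (an assumption; the model may handle them differently), this initialization can be sketched as:

```python
import random

def random_initial_values(n, initial_value_language, standard_dev, seed=None):
    """Draw one language value per agent from a normal distribution
    with the given mean and standard deviation, clipped to [0, 1]."""
    rng = random.Random(seed)
    return [min(1.0, max(0.0, rng.gauss(initial_value_language, standard_dev)))
            for _ in range(n)]

# standard-dev = 0 gives every agent the same value:
assert random_initial_values(5, 0.6, 0.0) == [0.6] * 5
```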

For non-Bayesian communication algorithms, it is also possible to set the language values for the unbiased agents. We recommend assigning the same value to all unbiased agents.

Redistribute/reset buttons

If you want to re-use exactly the same language values as the ones you have just used, use the reset language value button.

If you want to redistribute language values in the same network (i.e., give agents new language values), use the redistribute language values button.

Assign internal bias

First, select whether you want a binary or a continuous bias. If you want to split the population into two parts, agents with bias X and agents with bias Y (possibly non-biased), use the binary bias. If you want the population to be continuously biased over the range 0 to 1, use the continuous bias.

For now, with Bayesian communication algorithms, the program only handles binary bias.

Continuous bias (only for non-Bayesian)

You have two options: either use random continuous states (is-state-Random = On) or manually choose the distribution of the continuous states (is-state-Random = Off).

If you set this option to On, there is no need to fill the other boxes below.

If you set this option to Off, select the shape of the distribution of the states (distribution-state). Then, depending on the distribution chosen, set the parameters of the curve. For example, if you selected a gamma distribution, you only need to set the options alpha-if-gamma and lambda-if-gamma.

The last option (exactly-same-for-unbiased) assigns the same values to the unbiased agents. The option to manually select the values for the unbiased bias has not been implemented yet. Please note that you can visualize the distribution of the states in the graph below.

Binary bias (for all)

Use this if you want to split the population into 2 parts.

First, set the value of the bias for population 1 (init-langval-1) and for population 0 (init-langval-0). Please note that if you want a biased population and a non-biased population, the biased population must always be population 1, because the centrality measures are computed with respect to this population.

Second, select the percentage of agents in population 1 (percent-state-1). There is also an unbiased bias for which you can select another percentage of unbiased agents. The unbiased bias has not been implemented for Bayesian algorithms.

Third, select how the bias is distributed according to the agents’ centrality. There are two options: assign the bias to the most influential agents (choose-N-influent = On), or assign the bias according to a centrality ratio (choose-N-influent = Off).

choose-N-influent = On: You can use this option with any type of network. If you select it, you must enter the number of most influential agents that will receive the bias init-langval-1 (N-influent). This number must always be less than the number of agents in population 1 (percent-state-1 * num-agents): in NetLogo the slider adapts automatically, but in BehaviorSpace make sure not to enter too high a number. The algorithm works this way:

  • assign init-langval-0 to all agents
  • find the most influential agents (currently using eigenvector centrality, but this can easily be changed in the code)
  • assign init-langval-1 to the N-influent most influential agents. For example, if N-influent = 2, the 2 most influential agents in the network receive init-langval-1.
  • randomly assign init-langval-1 to further agents, so that the number of agents with init-langval-1 matches percent-state-1.
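The four steps above can be sketched in Python (the function and its argument names are illustrative, not the NetLogo names):

```python
import random

def assign_biases(centrality, percent_state_1, n_influent,
                  init_langval_1, init_langval_0, seed=None):
    """centrality: dict mapping agent id -> centrality score."""
    rng = random.Random(seed)
    n_pop1 = round(percent_state_1 * len(centrality))
    # 1. assign init-langval-0 to all agents
    bias = {agent: init_langval_0 for agent in centrality}
    # 2-3. the N-influent most central agents get init-langval-1
    ranked = sorted(centrality, key=centrality.get, reverse=True)
    for agent in ranked[:n_influent]:
        bias[agent] = init_langval_1
    # 4. randomly fill up population 1 to match percent-state-1
    rest = ranked[n_influent:]
    for agent in rng.sample(rest, n_pop1 - n_influent):
        bias[agent] = init_langval_1
    return bias
```

With N-influent = 0, steps 2-3 are empty and init-langval-1 is assigned fully at random, which matches the first of the two shortcut options described further below.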

choose-N-influent = Off: You can select this option only with scale-free networks. If you select it, you must enter the desired ratio-centrality-Scale-free. This ratio means that the mean centrality of the agents with init-langval-1 must be X times higher than the mean centrality of the agents with init-langval-0. The algorithm works this way:

  • assign init-langval-0 to all agents
  • while loop:
    • randomly assign init-langval-1 in the network;
    • compute the mean centrality of agents with init-langval-1 and with init-langval-0;
    • if the mean centrality of population 1 equals ratio-centrality-Scale-free * the mean centrality of population 0 (± 0.01, a tolerance added so the loop converges faster), break the loop; otherwise, redo everything.

The while loop runs relatively fast thanks to the ± 0.01 tolerance. For data visualization, it is easy to test a high ratio, such as 5 or 6. However, in BehaviorSpace, as many iterations are run, setting too high a value for the ratio can make the program crash. We recommend using ratio-centrality-Scale-free = 1 or 2.
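A Python sketch of this rejection loop, assuming the ± 0.01 tolerance is applied to the difference between the observed and target ratio (the NetLogo code may apply it slightly differently):

```python
import random
from statistics import mean

def assign_by_ratio(centrality, percent_state_1, target_ratio,
                    tol=0.01, max_tries=100_000, seed=None):
    """Randomly re-draw population 1 until its mean centrality is
    target_ratio times that of population 0 (within +/- tol)."""
    rng = random.Random(seed)
    agents = list(centrality)
    n_pop1 = round(percent_state_1 * len(agents))
    for _ in range(max_tries):
        pop1 = set(rng.sample(agents, n_pop1))
        m1 = mean(centrality[a] for a in pop1)
        m0 = mean(centrality[a] for a in agents if a not in pop1)
        if abs(m1 / m0 - target_ratio) <= tol:
            return pop1
    raise RuntimeError("no assignment within tolerance; lower the ratio")
```

The higher the target ratio, the fewer random assignments satisfy the condition, which is why high ratios become expensive over many BehaviorSpace iterations.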

If you do not want to vary this centrality parameter, you have 2 options:

  • set choose-N-influent = On and N-influent = 0: in this case, init-langval-1 is assigned randomly in the population;
  • set choose-N-influent = Off and ratio-centrality-Scale-free = 1: this differs from the previous case in that the mean centralities of agents with init-langval-1 and with init-langval-0 will be approximately the same. Over many replications the two options are equivalent, but the first is recommended because it is computationally less expensive.

As for the language values, you can reset or redistribute the states. Please note that you can visualize the mean centrality of agents with state = 1 and state = 0.

Choose communication algorithm

First, you need to select the communication algorithm. You have several options:

Individual algorithm

Each tick, each agent adopts the language value of one randomly selected neighbor.

(borrowed from Language Change model library)
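A one-tick Python sketch for a single agent (names are illustrative):

```python
import random

def individual_update(language_value_of, neighbors, rng=random):
    """Adopt the language value of one randomly chosen neighbor."""
    return language_value_of[rng.choice(neighbors)]
```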

Threshold algorithm

(works with threshold-val and sink-state-1 options):

  • sum the language values of all neighbors
  • if this sum is greater than the number of neighbors * threshold-val, set the language value to 1.
  • if sink-state-1 is Off and this sum is less than the number of neighbors * threshold-val, set the language value to 0.

(borrowed from Language Change model library)
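The three rules above can be sketched as follows (a hypothetical helper, not the model's code):

```python
def threshold_update(own_value, neighbor_values, threshold_val, sink_state_1):
    """Return the agent's new language value after one tick."""
    total = sum(neighbor_values)
    cutoff = len(neighbor_values) * threshold_val
    if total > cutoff:
        return 1.0      # enough neighbors use the feature
    if not sink_state_1 and total < cutoff:
        return 0.0      # state 0 is reachable only when sink-state-1 is Off
    return own_value    # otherwise keep the current value
```

With sink-state-1 = On, state 1 acts as an absorbing state: agents can move to 1 but never back to 0.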

Reward algorithm

(works with value-bias and logistic):

  • assign value-bias = 0 to agents in population 0, and the value-bias manually selected by the experimenter to agents in population 1.

  • Create utterances according to the internal value-bias:

    • if logistic: create a value 1 / (1 + exp(-(((value-bias + 0.1) * 20 * language-value) - 1) * 5)) → basically, this means that if you are non-biased (value-bias = 0), you produce utterances according to the internal value of your language, but if you are biased, you tend to produce more utterances equal to 1.
    • if not logistic: agents produce utterance only according to their internal language value (no bias toward the feature)
  • Listen to utterances spoken by neighbors: change your internal language value according to the neighbors’ utterances.

(borrowed from Language Change model library)
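The logistic production rule can be sketched in Python; the constants (0.1, 20, 5) follow the formula given in the text as we read it:

```python
import math

def production_probability(language_value, value_bias, logistic=True):
    """Probability of producing utterance = 1 under the reward algorithm."""
    if not logistic:
        return language_value  # no bias toward the feature
    return 1.0 / (1.0 + math.exp(-(((value_bias + 0.1) * 20 * language_value) - 1) * 5))
```

For a non-biased agent (value-bias = 0), the curve crosses 0.5 exactly at language-value = 0.5; a positive value-bias shifts production toward utterances equal to 1.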

Probabilistic algorithm

(works with value-bias-for-state1 and value-bias-for-state-0 for binary option, and min-value-bias and max-value-bias for continuous option)

  • Speak: draw 30 utterances from a binomial function with language-value as the probability. The spoken state is the mean of the 30 utterances. Contrary to the reward algorithm, speaking is not affected by the bias.

  • Listen using the following formula:

    • if heard-state >= language-value:
      new-lang-value = language-value + ( abs(language-value - heard-state) * value-bias-for-state-0 )
    • else:
      new-lang-value = language-value - ( abs(language-value - heard-state) * bias ), where bias is value-bias-for-state-0 for non-biased agents and value-bias-for-state-1 for biased agents

Consequently, when they hear utterances lower than their internal language value, biased agents tend to decrease this value less than non-biased agents do.
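A Python sketch of one speak/listen exchange (the function names are ours, not the model's):

```python
import random

def prob_speak(language_value, rng=random, n_utt=30):
    """Spoken state: mean of 30 Bernoulli draws with p = language-value."""
    return sum(rng.random() < language_value for _ in range(n_utt)) / n_utt

def prob_listen(language_value, heard_state, bias_state_0, bias_state_1, biased):
    """Listen update following the ifelse formula above."""
    step = abs(language_value - heard_state)
    if heard_state >= language_value:
        return language_value + step * bias_state_0
    decrease_bias = bias_state_1 if biased else bias_state_0
    return language_value - step * decrease_bias
```

With value-bias-for-state-1 smaller than value-bias-for-state-0, biased agents indeed decrease their value less whenever the heard state falls below it.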

Bayesian algorithm

(works with learning-acceptance1, learning-acceptance0, init-langval-0 and init-langval-1)

  • The parameters learning-acceptance and init-langval-1 are used in another R script to generate the alpha and beta values of a Beta distribution. The code works in NetLogo, but because it often crashed with BehaviorSpace, the alpha and beta values of interest were simply extracted and copy-pasted into NetLogo. From these values, each agent is initialized with a Beta distribution that is then updated according to the utterances heard.

  • Speak: two options can be set:

    • MAP: extract the mode of the Beta distribution, then use the binomial function to draw an utterance from this probability
    • SAM: extract a random number from the Beta distribution, then use the binomial function to draw an utterance from this probability
  • Listen: the same for both MAP and SAM algorithms

    • Update the Beta distribution using the following formulas: new alpha = alpha + utterance heard (0 or 1); new beta = beta + 1 - utterance heard
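Both production modes and the listen update can be sketched in Python (the closed-form Beta mode used for MAP assumes alpha, beta > 1):

```python
import random

def bayes_speak(alpha, beta, mode="SAM", rng=random):
    """Produce one binary utterance from a Beta(alpha, beta) belief."""
    if mode == "MAP":
        p = (alpha - 1) / (alpha + beta - 2)   # mode of the Beta distribution
    else:  # SAM
        p = rng.betavariate(alpha, beta)       # one random draw from the Beta
    return 1 if rng.random() < p else 0

def bayes_listen(alpha, beta, utterance):
    """new alpha = alpha + utterance; new beta = beta + 1 - utterance."""
    return alpha + utterance, beta + 1 - utterance
```

SAM keeps sampling variability in production, while MAP always speaks from the single most probable language value.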

References

Albert, R. (2005). Scale-free networks in cell biology. Journal of Cell Science, 118(21), 4947–4957. https://doi.org/10.1242/jcs.02714

Albert, R., Jeong, H., & Barabási, A.-L. (1999). Diameter of the World-Wide Web. Nature, 401(6749), 130–131. https://doi.org/10.1038/43601

Barabási, A.-L., Albert, R., & Jeong, H. (2000). Scale-free characteristics of random networks: The topology of the world-wide web. Physica A: Statistical Mechanics and Its Applications, 281(1), 69–77. https://doi.org/10.1016/S0378-4371(00)00018-2

Dediu, D. (2008). The role of genetic biases in shaping language-genes correlations. Journal of Theoretical Biology, 254, 400–407. https://doi.org/10.1016/j.jtbi.2008.05.028

Dediu, D. (2009). Genetic biasing through cultural transmission: Do simple Bayesian models of language evolution generalize? Journal of Theoretical Biology, 259(3), 552–561. https://doi.org/10.1016/j.jtbi.2009.04.004

Erdős, P., & Rényi, A. (1959). On Random Graphs I. Publicationes Mathematicae (Debrecen), 6, 290–297.

Griffiths, T. L., & Kalish, M. L. (2007). Language evolution by iterated learning with Bayesian agents. Cognitive Science, 31(3), 441–480. https://doi.org/10.1080/15326900701326576

Janssen, R. (2018). Let the agent do the talking: On the influence of vocal tract anatomy on speech during ontogeny and glossogeny (PhD thesis). Nijmegen.

Kenett, Y. N., Levy, O., Kenett, D. Y., Stanley, H. E., Faust, M., & Havlin, S. (2018). Flexibility of thought in high creative individuals represented by percolation analysis. Proceedings of the National Academy of Sciences, 115(5), 867–872. https://doi.org/10.1073/pnas.1717362115

Kirby, S., Dowman, M., & Griffiths, T. L. (2007). Innateness and culture in the evolution of language. Proc Natl Acad Sci U S A, 104(12), 5241–5245. https://doi.org/10.1073/pnas.0608222104

Kitsak, M., Gallos, L. K., Havlin, S., Liljeros, F., Muchnik, L., Stanley, H. E., & Makse, H. A. (2010). Identification of influential spreaders in complex networks. Nature Physics, 6(11), 888–893. https://doi.org/10.1038/nphys1746

Milgram, S. (1967). The small-world problem. Psychology Today, 1(1), 61–67.

Watts, D. J., & Strogatz, S. H. (1998). Collective dynamics of “small-world” networks. Nature, 393(6684), 440–442. https://doi.org/10.1038/30918